ANY-1 v3 Instruction Set

© 2021 Robert Finch

Table of Contents

[Instruction Formats](#_Toc75218766)

[Register Specifiers](#_Toc75218767)

[Constant Interpretation for Float Instructions](#_Toc75218768)

[Vector Instruction Indicator](#_Toc75218769)

[Root Opcode](#_Toc75218770)

[Extended Immediate](#_Toc75218771)

[Register Formats](#_Toc75218772)

[R1 (one source register)](#_Toc75218773)

[R2 (two source register)](#_Toc75218774)

[Branch Instructions](#_Toc75218775)

[Instruction Modifiers](#_Toc75218776)

[IMOD Instruction Modifier - 58](#_Toc75218777)

[Branch Modifier – 5A](#_Toc75218778)

[Stride Modifier – 5C](#_Toc75218779)

[Example Instruction](#_Toc75218780)

[Instructions](#_Toc75218781)

[Arithmetic / Logical](#_Toc75218782)

[ABS – Absolute Value](#_Toc75218783)

[ADD - Addition](#_Toc75218784)

[AND – Bitwise And](#_Toc75218785)

[BMM – Bit Matrix Multiply](#_Toc75218786)

[BYTNDX – Byte Index](#_Toc75218787)

[CMP – Compare](#_Toc75218788)

[CNTPOP – Count Population](#_Toc75218789)

[CNTLZ – Count Leading Zeros](#_Toc75218790)

[COM – Ones Complement](#_Toc75218791)

[DEP – Deposit](#_Toc75218792)

[DIF – Difference](#_Toc75218793)

[DIV – Division](#_Toc75218794)

[DIVR – Division](#_Toc75218795)

[DIVSU – Division Signed-Unsigned](#_Toc75218796)

[DIVU – Division Unsigned](#_Toc75218797)

[EOR – Bitwise Exclusive Or](#_Toc75218798)

[EXIn – Extended Immediate](#_Toc75218799)

[EXT – Extract Bitfield](#_Toc75218800)

[EXTU – Extract Bitfield Unsigned](#_Toc75218801)

[FDP – Fused Dot Product](#_Toc75218802)

[FFO – Find First One](#_Toc75218803)

[MAX – Maximum Value](#_Toc75218804)

[MIN – Minimum Value](#_Toc75218805)

[MOD – Instruction Modifier](#_Toc75218806)

[MUL – Multiply](#_Toc75218807)

[MULF – Fast Unsigned Multiply](#_Toc75218808)

[MULU – Unsigned Multiply](#_Toc75218809)

[MUX – Multiplex](#_Toc75218810)

[NABS – Negative Absolute Value](#_Toc75218811)

[NEG - Negate](#_Toc75218812)

[NOT – Logical Not](#_Toc75218813)

[OR – Bitwise Or](#_Toc75218814)

[PERM – Permute Bytes](#_Toc75218815)

[PTRDIF – Difference Between Pointers](#_Toc75218816)

[SEQ – Set if Equal](#_Toc75218817)

[SGE – Set if Greater Than or Equal](#_Toc75218818)

[SGEU – Set if Greater Than or Equal Unsigned](#_Toc75218819)

[SGT – Set if Greater Than](#_Toc75218820)

[SGTU – Set if Greater Than Unsigned](#_Toc75218821)

[SIGN – Sign (Compare to Zero)](#_Toc75218822)

[SLL – Shift Left Logical](#_Toc75218823)

[SLLP – Shift Left Logical Pair](#_Toc75218824)

[SLT – Set if Less Than](#_Toc75218825)

[SLE – Set if Less Than or Equal](#_Toc75218826)

[SLEU – Set if Less Than or Equal](#_Toc75218827)

[SLTU – Set if Less Than Unsigned](#_Toc75218828)

[SNE – Set if Not Equal](#_Toc75218829)

[SQRT – Square Root](#_Toc75218830)

[SRA – Shift Right Arithmetic Pair](#_Toc75218831)

[SRL – Shift Right Logical](#_Toc75218832)

[SRLP – Shift Right Logical Pair](#_Toc75218833)

[SUB - Subtract](#_Toc75218834)

[SUBF – Subtract From](#_Toc75218835)

[U21NDX – UTF21 Index](#_Toc75218836)

[WYDNDX – Wyde Index](#_Toc75218837)

[XOR – Bitwise Exclusive Or](#_Toc75218838)

[ZXB – Zero Extend Byte](#_Toc75218839)

[ZXW – Zero Extend Wyde](#_Toc75218840)

[ZXT – Zero Extend Tetra](#_Toc75218841)

[Graphics](#_Toc75218842)

[BLEND – Blend Colors](#_Toc75218843)

[TRANSFORM – Transform Point](#_Toc75218844)

[RW\_COEEF – Read/Write Co-efficient](#_Toc75218845)

[Memory Operations](#_Toc75218846)

[CACHE – Cache Command](#_Toc75218847)

[LDx – Load](#_Toc75218848)

[LDB – Load Byte (8 bits)](#_Toc75218849)

[LDBZ – Load Byte, Zero Extend (8 bits)](#_Toc75218850)

[LDO – Load Octa (64 bits)](#_Toc75218851)

[LDT – Load Tetra (32 bits)](#_Toc75218852)

[LDTZ – Load Tetra, Zero Extend (32 bits)](#_Toc75218853)

[LDW – Load Wyde (16 bits)](#_Toc75218854)

[LDWZ – Load Wyde, Zero Extend (16 bits)](#_Toc75218855)

[LEA – Load Effective Address](#_Toc75218856)

[STx – Store](#_Toc75218857)

[STB – Store Byte (8 bits)](#_Toc75218858)

[STBZ – Store Byte and Zero (8 bits)](#_Toc75218859)

[STO – Store Octa (64 bits)](#_Toc75218860)

[STOZ – Store Octa and Zero (64 bits)](#_Toc75218861)

[STPTR – Store Pointer (64 bits)](#_Toc75218862)

[STT – Store Tetra (32 bits)](#_Toc75218863)

[STTZ – Store Tetra and Zero (32 bits)](#_Toc75218864)

[STW – Store Wyde (16 bits)](#_Toc75218865)

[STWZ – Store Wyde and Zero (16 bits)](#_Toc75218866)

[Flow Control (Branch Unit) Operations](#_Toc75218867)

[Branches](#_Toc75218868)

[BAL – Branch and Link](#_Toc75218869)

[BBS – Branch if Bit Set](#_Toc75218870)

[BEQ – Branch if Equal](#_Toc75218871)

[BGE – Branch if Greater Than or Equal](#_Toc75218872)

[BGEU – Branch if Greater Than or Equal Unsigned](#_Toc75218873)

[BGT – Branch if Greater Than](#_Toc75218874)

[BGTU – Branch if Greater Than Unsigned](#_Toc75218875)

[BNE – Branch if Not Equal](#_Toc75218876)

[BLE – Branch if Less Than or Equal](#_Toc75218877)

[BLEU – Branch if Less Than or Equal Unsigned](#_Toc75218878)

[BLT – Branch if Less Than](#_Toc75218879)

[BLTU – Branch if Less Than Unsigned](#_Toc75218880)

[BRA – Unconditional Branch](#_Toc75218881)

[BSR – Unconditional Branch to Subroutine](#_Toc75218882)

[CHK – Check Register Against Bounds](#_Toc75218883)

[JAL – Jump and Link](#_Toc75218884)

[JALR – Jump and Link to Register](#_Toc75218885)

[JMP – Jump](#_Toc75218886)

[RET – Return from Subroutine](#_Toc75218887)

[System Instructions](#_Toc75218888)

[BRK – Break](#_Toc75218889)

[CSRx – Control and Special / Status Access](#_Toc75218890)

[PEEK – Peek at Queue / Stack](#_Toc75218891)

[PFI – Poll for Interrupt](#_Toc75218892)

[POP – Pop from Queue / Stack](#_Toc75218893)

[PUSH – Push on Queue / Stack](#_Toc75218894)

[REX – Redirect Exception](#_Toc75218895)

[RTE – Return from Exception](#_Toc75218896)

[STAT – Get Status of Queue / Stack](#_Toc75218897)

[SYNC - Synchronize](#_Toc75218898)

[TLBRW – Read / Write TLB](#_Toc75218899)

[WFI – Wait for Interrupt](#_Toc75218900)

[Vector Specific Instructions](#_Toc75218901)

[MFILL – Mask Fill](#_Toc75218902)

[MFIRST – Find First Set Bit](#_Toc75218903)

[MFM – Move from Mask](#_Toc75218904)

[MFVL – Move from Vector Length](#_Toc75218905)

[MLAST – Find Last Set Bit](#_Toc75218906)

[MTM – Move to Mask](#_Toc75218907)

[MTVL – Move to Vector Length](#_Toc75218908)

[Arithmetic / Logical](#_Toc75218909)

[V2BITS](#_Toc75218910)

[VBITS2V](#_Toc75218911)

[VCIDX – Compress Index](#_Toc75218912)

[VCMPRSS – Compress Vector](#_Toc75218913)

[VEINS / VMOVSV – Vector Element Insert](#_Toc75218914)

[VEX / VMOVS – Vector Element Extract](#_Toc75218915)

[VSCAN](#_Toc75218916)

[VSLLV – Shift Vector Left Logical](#_Toc75218917)

[VSRLV – Shift Vector Right Logical](#_Toc75218918)

[Memory Operations](#_Toc75218919)

[CVLDx – Compressed Vector Load](#_Toc75218920)

[CVSTx – Compressed Vector Store](#_Toc75218921)

[Root Opcode Map](#_Toc75218922)

[{R1} Integer Monadic Register Ops – Func10](#_Toc75218923)

[{R2} Integer Dyadic Register Ops – Func7](#_Toc75218924)

[{R3} Triadic Register Ops](#_Toc75218925)

[{F1} Floating-Point Monadic Ops – Funct7](#_Toc75218926)

[{F2} Floating-Point Dyadic Ops – Funct7](#_Toc75218927)

[{F3} Floating-Point Dyadic Ops – Funct7](#_Toc75218928)

[{VM} Vector Mask Register Ops](#_Toc75218929)

[{OSR2} System Ops](#_Toc75218930)

# Instruction Formats

ANY1 has relatively few instruction formats. Every instruction is a fixed 36 bits in size. It is highly desirable to keep the instruction size to a minimum, as minimally sized instructions have better entropy characteristics. The instruction format contains more decode information than is present in some instruction sets; in particular, register type codes are associated with the register spec fields. This keeps the instruction decoder hardware small. Otherwise, a very large decoder would be required to handle all possible combinations of instructions and register types. A vector machine that supports multiple primitive data types leads to a design with potentially many instruction variations.

## Register Specifiers

The seven-bit register specifier field of an instruction looks like:

|  |  |
| --- | --- |
| 26 25 | 24 20 |
| Tb2 | Rb5 |

Register specifiers are always located at the same fixed positions in all instructions. This increases performance and minimizes decoding hardware.

Register specifiers contain a one or two-bit type code and a five-bit register number. The meaning of the type code is in the following table:

|  |  |
| --- | --- |
| Ty2 | Meaning |
| 0 | Scalar register |
| 1 | Vector register |
| 2,3 | Six-bit constant value (bit 0 of Ty is the high order bit of the constant)  Not available for Ra, Rt register specs |

Note that allowing either scalar or vector registers to be specified in the register spec eliminates the need for special classes of instructions to handle scalar-scalar, vector-scalar, or vector-vector operations.

For signed operations the six-bit constant is treated as a signed value and sign extended to 64 bits. For unsigned operations (BLTU, BGEU, SLTU, SGEU, …) the six-bit constant is treated as an unsigned value and zero extended to 64 bits.
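As an illustration, the type code and constant rules above can be sketched in Python for the Rb specifier. This is a minimal sketch rather than the actual decoder; the function name `decode_rb_spec` and the tuple it returns are hypothetical, and the field positions (Tb in bits 26..25, Rb in bits 24..20) are taken from the specifier table.

```python
MASK64 = (1 << 64) - 1

def decode_rb_spec(insn, signed=True):
    """Decode the Rb register specifier of a 36-bit instruction word.

    Returns ('scalar', n), ('vector', n) or ('const', value), where value
    is the six-bit constant extended to 64 bits.
    """
    ty = (insn >> 25) & 0x3      # Tb2, bits 26..25
    num = (insn >> 20) & 0x1F    # Rb5, bits 24..20
    if ty == 0:
        return ("scalar", num)
    if ty == 1:
        return ("vector", num)
    # Type 2 or 3: bit 0 of the type code is the high-order constant bit.
    const6 = ((ty & 1) << 5) | num
    if signed and (const6 & 0x20):
        const6 -= 0x40           # sign extend from bit 5
    return ("const", const6 & MASK64)
```

For example, a specifier with type 3 and register field 1Fh holds the constant 3Fh, which reads as -1 for a signed operation and as 63 for an unsigned one.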

### Constant Interpretation for Float Instructions

For floating-point instructions, a constant is treated as a positive six-bit floating-point value which is extended to 64 bits before use. The three-bit exponent covers a range of -3 to +4.

|  |  |
| --- | --- |
| Bits 3 to 5 | Bits 0 to 2 |
| 3-bit Exponent | 3-bit significand |

The significand has a hidden leading one bit.
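The value of a six-bit float constant can be sketched as follows. This is only an illustration under two assumptions not stated in the text: the exponent field occupies the three bits directly above the significand, and it is biased by 3 so that field values 0 to 7 map onto the stated range of -3 to +4. The function name is hypothetical.

```python
def decode_float_const6(c):
    """Return the value of a six-bit positive float constant.

    Assumes exponent field = upper three bits with a bias of 3, and a
    three-bit significand with a hidden leading one.
    """
    exp_field = (c >> 3) & 0x7
    sig = c & 0x7
    return (1.0 + sig / 8.0) * 2.0 ** (exp_field - 3)
```

Under these assumptions the representable constants run from 0.125 (all bits zero) to 30.0 (all bits one).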

## Vector Instruction Indicator

The processing core needs to know whether an instruction is a vector instruction before it is fully decoded, because a vector instruction may be re-decoded and sent into the pipeline multiple times. This must be determined quickly and simply at the instruction fetch stage, so ANY1 encodes the information in bit 7 of all instructions. See the sample instruction below.

Immediate Format:

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | ▼ |  |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 09h8 |

## Root Opcode

The root opcode determines the class of instructions executed. Some commonly executed instructions are also encoded at the root level to make more bits available for the instruction. The root opcode is always present in all instructions as the lowest seven bits of the instruction.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  |  | ▼ |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 09h8 |

## Target Register Spec

Most instructions have a target register. The register spec for the target register is always in the same position, bits 8 to 13 of an instruction. The Tt field specifies whether the target register is a scalar (0) or vector (1) register.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  | ▼ | ▼ |  |  |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 09h8 |
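Because the fields sit at fixed positions, extracting them needs only shifts and masks. The sketch below decodes the immediate format shown above; the helper names `encode_ri` and `decode_ri` are hypothetical and only the bit positions come from the table.

```python
def encode_ri(opcode, rt, ra, const16, v=0, tt=0, ta=0):
    """Pack the fields of the immediate format into a 36-bit word."""
    return ((const16 & 0xFFFF) << 20 | (ta & 1) << 19 | (ra & 0x1F) << 14
            | (tt & 1) << 13 | (rt & 0x1F) << 8 | (v & 1) << 7
            | (opcode & 0x7F))

def decode_ri(insn):
    """Split a 36-bit immediate-format word into its fields."""
    return {
        "opcode": insn & 0x7F,             # root opcode, bits 6..0
        "v": (insn >> 7) & 1,              # vector indicator, bit 7
        "Rt": (insn >> 8) & 0x1F,          # target register, bits 12..8
        "Tt": (insn >> 13) & 1,            # target type, bit 13
        "Ra": (insn >> 14) & 0x1F,         # source register, bits 18..14
        "Ta": (insn >> 19) & 1,            # source type, bit 19
        "const16": (insn >> 20) & 0xFFFF,  # immediate, bits 35..20
    }
```

A fetch stage would only need `(insn >> 7) & 1` to decide whether the instruction has to be replayed per vector element.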

## Extended Immediate

The extended immediate instructions extend the immediate constant of the following instruction, starting at bit 11. Five root opcodes are reserved for extended immediates. See the [EXIn](#_EXIn_–_Extended) description.

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant38..11 | 508 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant66..39 | 518 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant94..67 | 528 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant122..95 | 538 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant150..123 | 548 |
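The way the prefix fields stack up can be sketched as below. This is an illustration only: it assumes the following instruction supplies constant bits 10..0, that each prefix contributes 28 bits in the order shown in the tables, and that the assembled value is sign extended from its top bit and truncated to 64 bits. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1

def build_constant(low11, prefixes):
    """Assemble a 64-bit constant from extended-immediate prefixes.

    low11    -- bits 10..0, taken from the instruction being extended
    prefixes -- 28-bit prefix fields, least significant prefix first
                (at least one prefix is expected)
    """
    value = low11 & 0x7FF
    width = 11
    for field in prefixes:
        value |= (field & 0xFFFFFFF) << width
        width += 28
    if value & (1 << (width - 1)):   # sign extend from the top bit
        value -= 1 << width
    return value & MASK64
```

One prefix yields a 39-bit constant and two prefixes yield 67 bits, which already covers a full 64-bit value.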

## Register Formats

### R1 (one source register)

With just one source register spec there is room available in the instruction to encode the vector mask register for vector instructions. This avoids the need for an instruction modifier.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| func7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

### R2 (two source register)

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| func7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

### Branch Instructions

Branch instructions use bits 8 to 13 and 27 to 35 to specify a 15-bit branch displacement. The displacement is a count of instructions to skip over, which gives a range of approximately ±73kB (±16,384 instructions at 4.5 bytes each). There are no vector branch instructions; opcodes that would encode vector branches are reserved for future use.

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 27 | 26 25 | 24 20 | 19 | 18 14 | 13 8 | 7 | 6 0 |
| Const9 | Tb2 | Rb5 | Ta | Ra5 | Const6 | 0 | 4xh8 |
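A sketch of how the split displacement might be reassembled follows. It assumes that Const9 supplies the upper nine bits and Const6 the lower six bits of the displacement, and that the 15-bit result is signed; the text does not state the ordering, so treat this as an illustration. The function name is hypothetical.

```python
def branch_displacement(insn):
    """Return the signed instruction-count displacement of a branch."""
    const9 = (insn >> 27) & 0x1FF   # bits 35..27
    const6 = (insn >> 8) & 0x3F     # bits 13..8
    disp = (const9 << 6) | const6   # assumed ordering: Const9 is the high part
    if disp & 0x4000:               # sign extend from bit 14
        disp -= 0x8000
    return disp
```

The branch target is then the address of the instruction that lies `disp` instructions away from the branch.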

## Instruction Modifiers

### IMOD Instruction Modifier - 58

This modifier adds two register spec fields, allowing an instruction to use up to four source registers. It also allows a vector mask register to be optionally specified. A rounding mode may also be specified for instructions that support rounding.

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 26 25 | 24 20 | 19 | 18 14 | 13 12 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |
| --- | --- |
| A | Meaning |
| 00 | Ignore mask and rounding |
| 01 | Apply vector mask |
| 10 | Apply rounding |
| 11 | Apply both vector mask and rounding |

### Branch Modifier – 5A

The branch modifier adds a link register for storing a return address, which allows conditional subroutine calls. It also provides a branch target register spec, which allows conditional branching relative to a value in a register. Fifteen additional bits of branch displacement are provided, making the total branch displacement 30 bits (or ±2.4GB).

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 20 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant15 | Tc2 | Rc5 | Tt | Rt5 | 0 | 5Ah7 |

Instruction Modifier

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 14 | 13 12 | 11 9 | 8 | 7 0 |
| C3 | Rm3 | Constant12 | A | m3 | z | 5Bh8 |

### Stride Modifier – 5C

The stride modifier is used with vector load / store instructions to specify the stride of the operation.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 20 19 | 18 14 | 13 12 | 11 9 | 8 | 7 0 |
| Const15 | Tc2 | Rc5 | A | m3 | z | 5Ch8 |

z: 1 = zero vector element if mask bit clear, 0 = vector element unchanged (ignored for scalar ops)

m3: vector mask register (ignored for scalar operations).

Rm3: rounding mode

|  |  |  |  |
| --- | --- | --- | --- |
| Sz4 | Size | Qualifier | Alt Qualifier |
| 0 | byte | .b |  |
| 1 | wyde | .w |  |
| 2 | tetra | .t | .s (single) |
| 3 | octa | .o | .d (double) |
| 4 | hexi | .h | .q (quad) |
| 8 | SIMD byte | .bp |  |
| 9 | SIMD wyde | .wp |  |
| 10 | SIMD tetra | .tp | .sp |
| 11 | SIMD octa | .op | .dp |
| 12 | SIMD hexi | .hp | .qp |

## Example Instruction

add.int.o x1,x2,x3 ; scalar add of integers x2,x3

add.int.o v1,v2,v3,vm0 ; vector add of integers v2,v3

add.int.o v1,v2,x4,vm0 ; vector add scalar integers v2,x4

add.fp.o v1,v2,v3,vm0 ; vector add of floating-point doubles v2,v3

# Instructions

## Arithmetic / Logical

### ABS – Absolute Value

**Description:**

This instruction takes the absolute value of a register and places the result in a target register.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 06h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

**v: 0 = scalar, 1 = vector op**

**Float Instruction Format: R1**

Both the source and target registers are treated as float values.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 20h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 34h7 |

**Decimal Float Instruction Format: R1**

Both the source and target registers are treated as decimal float values.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 20h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 30h7 |

**Operation:**

If Ra < 0

Rt = -Ra

else

Rt = Ra

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Rt[x] = Ra[x] < 0 ? -Ra[x] : Ra[x]

**Execution Units:** I, F, D, P

**Clock Cycles: 1**

**Exceptions:** none

**Notes:**

For sign-magnitude formats this instruction simply clears the MSB of the number. No rounding occurs.

### ADD - Addition

**Description:**

Add two values. The first operand must be in a register. The second operand may be in a register or may be an immediate value specified in the instruction.

**Operation:**

Rt = Ra + Imm

or

Rt = Ra + Rb

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] + Vb[x]

else if (z) Vt[x] = 0
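The masked vector operation above can be modelled in a few lines of Python. This is a behavioural sketch, not a description of the hardware; the function name and the representation of the mask as an integer bit vector are illustrative choices.

```python
MASK64 = (1 << 64) - 1

def vector_add(vt, va, vb, vm, z, vl):
    """Model a masked vector ADD over the first vl elements.

    vt, va, vb -- lists of 64-bit element values
    vm         -- mask register as an integer, bit x controls element x
    z          -- True zeroes masked-off elements, False leaves them as is
    """
    out = list(vt)
    for x in range(vl):
        if (vm >> x) & 1:
            out[x] = (va[x] + vb[x]) & MASK64
        elif z:
            out[x] = 0
    return out
```

With a mask of 0101b, elements 0 and 2 receive sums, while elements 1 and 3 either keep their old target values (z = 0) or are cleared (z = 1).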

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 04h7 |

1 clock cycle / N clock cycles (N = vector length)

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

1 clock cycle / N clock cycles (N = vector length)

**Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

25 clock cycles / N \* 25 clock cycles (N = vector length)

**Decimal Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 31h7 |

25 clock cycles / N \* 25 clock cycles (N = vector length)

**Posit Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 39h7 |

25 clock cycles / N \* 25 clock cycles (N = vector length)

**Vector Mask Instruction Format: R2 (MADD)**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 25 | 24 22 | 21 18 | 17 15 | 14 11 | 10 8 | 7 0 |
| 04h7 | 04 | Vmb3 | 04 | Vma3 | 04 | Vmt3 | 3Eh8 |

1 clock cycle

**Exceptions:** none

### AND – Bitwise And

**Description**:

Perform a bitwise ‘and’ operation between operands. The first operand must be in a register. The second operand may be in a register or may be an immediate value specified in the instruction. The immediate constant is one-extended (the upper bits are filled with ones) before use.
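The effect of a one extended immediate can be sketched as below, assuming it means that every bit above the 16-bit constant is set to one, so the upper 48 bits of Ra pass through the operation unchanged. The function name is hypothetical.

```python
MASK64 = (1 << 64) - 1

def and_immediate(ra, imm16):
    """Model AND with a 16-bit immediate whose upper bits are all ones."""
    imm64 = (imm16 & 0xFFFF) | (MASK64 & ~0xFFFF)
    return ra & imm64
```

For example, `and_immediate(0x123456789ABCDEF0, 0x00FF)` clears only bits 8 to 15 and leaves the rest of the register intact.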

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 08h7 |

1 clock cycle / N clock cycles (N = vector length)

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 08h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

1 clock cycle / N clock cycles (N = vector length)

**Vector Mask Instruction Format: R2 (MADD)**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 25 | 24 22 | 21 18 | 17 15 | 14 11 | 10 8 | 7 0 |
| 00h7 | 04 | Vmb3 | 04 | Vma3 | 04 | Vmt3 | 3Eh8 |

1 clock cycle

**Operation:**

Rt = Ra & Imm

or

Rt = Ra & Rb

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] & Vb[x]

else if (z) Vt[x] = 0

**Exceptions**: none

### BMM – Bit Matrix Multiply

BMM Rt, Ra, Rb

**Description**:

The BMM instruction treats the bits of register Ra and register Rb as an 8x8 matrix and performs a bit matrix multiply of the two registers and stores the result in the target register. An alternate mnemonic for this instruction is MOR.

**Instruction Format**: R2

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| func7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

|  |  |
| --- | --- |
| Fn7 | Function |
| 30h | MOR |
| 31h | MXOR |
| 32h | MORT (MOR transpose) |
| 33h | MXORT (MXOR transpose) |

**Operation**:

for i = 0 to 7

for j = 0 to 7

Rt.bit[i][j] = (Ra[i][0]&Rb[0][j]) | (Ra[i][1]&Rb[1][j]) | … | (Ra[i][7]&Rb[7][j])

**Clock Cycles:** 1

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:

The bits are numbered with bit 63 of a register representing i,j = 0,0 and bit 0 of the register representing i,j = 7,7.
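A bit-level model of the operation, using the bit numbering from the note, might look like the following. The MOR case follows the operation given above; treating MXOR as the same product with exclusive-or accumulation is an assumption based on the mnemonic. The function names are hypothetical.

```python
def matrix_bit(reg, i, j):
    """Bit (i, j) of a register viewed as an 8x8 matrix; bit 63 is (0, 0)."""
    return (reg >> (63 - (8 * i + j))) & 1

def bmm(ra, rb, use_xor=False):
    """8x8 bit matrix multiply: OR of ANDs (MOR) or XOR of ANDs (MXOR)."""
    result = 0
    for i in range(8):
        for j in range(8):
            acc = 0
            for k in range(8):
                term = matrix_bit(ra, i, k) & matrix_bit(rb, k, j)
                acc = (acc ^ term) if use_xor else (acc | term)
            result |= acc << (63 - (8 * i + j))
    return result
```

With this numbering the identity matrix is the value 8040201008040201h, and multiplying any register by it returns the register unchanged.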

### BYTNDX – Byte Index

**Description:**

This instruction searches Ra, which is treated as an array of eight bytes, for a byte value specified by Rb or an immediate value and places the index of the byte into the target register Rt. If the byte is not found -1 is placed in the target register. A common use would be to search for a null byte. The index result may vary from -1 to +7. The index of the first found byte is returned (closest to zero).

If a vector BYTNDX instruction is issued and the target is a scalar register, the instruction searches all the vector elements and returns a value from -1 to +511 in the scalar register. Thus, BYTNDX may be used to determine the length of a null-terminated string held in a vector register.
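The search can be sketched as follows, assuming byte index 0 is the least significant byte of the register and that vector elements are searched in ascending element order. Both function names are hypothetical.

```python
def bytndx(ra, value):
    """Index (0..7) of the first byte of ra equal to value, or -1."""
    for i in range(8):
        if (ra >> (8 * i)) & 0xFF == (value & 0xFF):
            return i
    return -1

def bytndx_vector(elements, value):
    """Index of the first matching byte across all elements, or -1."""
    for n, element in enumerate(elements):
        i = bytndx(element, value)
        if i >= 0:
            return 8 * n + i
    return -1
```

Searching for a zero byte in a register holding the bytes of "Hello" followed by nulls returns 5, the string length.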

**Instruction Format:** R2

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 31 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04 | ~5 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 1Ah7 |

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 31 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 14 | Constant12 | Ta | Ra5 | Tt | Rt5 | v | 1Ah7 |

**R2 Supported Formats**: .o

**Clock Cycles:** 1

**Execution Units:** Integer ALU

**Operation:**

Rt = Index of (Rb in Ra)

**Exceptions:** none

### CMP – Compare

**Description**

Compare two registers or a register and an immediate value and return the relationship between them.

**Integer Instruction Format: R2**

Both values are treated as signed numbers.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 20h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**1 clock cycle**

**Operation:**

Rt = Ra < Rb ? -1 : Ra == Rb ? 0 : 1

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] < Vb[x] ? -1 : Va[x] == Vb[x] ? 0 : 1

**Float Instruction Format: R2 (FCMP)**

Both values are treated as double precision (64-bit) floating point numbers. The result is returned as a float value of -1.0, 0.0 or +1.0. If the comparison is unordered 2.0 is returned.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 10h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

**1 clock cycle**

**Float Instruction Format: R2 (FCMPB)**

Both values are treated as double precision (64-bit) floating point numbers. The value returned is a bit vector as outlined in the table below. Note that the less than status is returned in both bits 1 and 63 so that a BLT may be used.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 15h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

**1 clock cycle**

The float comparison returns a bit vector containing the status of all possible relationships. This may then be tested with the BBS instruction.

|  |  |
| --- | --- |
| Rt bit | Meaning |
| 0 | = equal |
| 1 | < less than |
| 2 | <= less than or equal |
| 3 | < magnitude less than |
| 4 | unordered |
| 5 to 7 | zero (reserved) |
| 8 | <> not equal |
| 9 | >= greater than or equal |
| 10 | > greater than |
| 11 | >= magnitude greater than or equal |
| 12 | ordered |
| 13 to 62 | zero (reserved) |
| 63 | less than |
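The result table can be modelled in Python as shown below. This is a sketch: the text does not say which bits are set for an unordered comparison, so the model sets only the unordered bit in that case, which is an assumption. The function name is hypothetical.

```python
import math

def fcmpb(a, b):
    """Return the FCMPB status bit vector for two floats."""
    if math.isnan(a) or math.isnan(b):
        return 1 << 4                 # unordered (assumed to be the only bit set)
    result = 1 << 12                  # ordered
    if a == b:
        result |= 1 << 0              # equal
    if a < b:
        result |= (1 << 1) | (1 << 63)  # less than, mirrored in bit 63
    if a <= b:
        result |= 1 << 2              # less than or equal
    if abs(a) < abs(b):
        result |= 1 << 3              # magnitude less than
    if a != b:
        result |= 1 << 8              # not equal
    if a >= b:
        result |= 1 << 9              # greater than or equal
    if a > b:
        result |= 1 << 10             # greater than
    if abs(a) >= abs(b):
        result |= 1 << 11             # magnitude greater than or equal
    return result
```

Because bit 63 mirrors the less-than status, the result reads as a negative 64-bit integer exactly when a < b, which is why a BLT on the result works.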

**Decimal Float Instruction Format: R2 (DFCMP)**

Both values are treated as double precision (64-bit) decimal floating point numbers. The result is returned as a float value of -1.0, 0.0 or +1.0. If the comparison is unordered 2.0 is returned.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 10h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 31h7 |

**1 clock cycle**

### CNTPOP – Count Population

CNTPOP r1,r2

CNTPOP v1,v2

CNTPOP r1,vm2

**Description:**

Count the number of ones and place the count in the target register.

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = popcnt(Va[x])

**Instruction Format: R1**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 02h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

**Vector Mask Instruction Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 26 | 25 22 | 21 18 | 17 15 | 14 13 | 12 8 | 7 0 |
| 0Dh10 | ~4 | 05 | Vm3 | Tt2 | Rt5 | 3Eh8 |

**Execution Units:** Integer ALU

**Exceptions:** none

### CNTLZ – Count Leading Zeros

**Description**:

Count the number of leading zeros (starting at the MSB) in Ra and place the count in the target register.
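For a 64-bit (.o) operand the count can be modelled as below. The function name is hypothetical, and returning 64 for a zero operand follows from the definition rather than from an explicit statement in the text.

```python
MASK64 = (1 << 64) - 1

def cntlz64(value):
    """Number of zero bits above the most significant one bit."""
    return 64 - (value & MASK64).bit_length()
```

A value with only bit 0 set gives 63, and a value with bit 63 set gives 0.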

**Instruction Format: R1**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 00h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

**Vector Mask Instruction Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 26 | 25 22 | 21 18 | 17 15 | 14 13 | 12 8 | 7 0 |
| 00h10 | ~4 | 05 | Vm3 | Tt2 | Rt5 | 3Eh8 |

**R1 Supported Formats**: .o

**Clock Cycles**: 1

**Execution Units:** Integer ALU

**Exceptions:** none

### COM – Ones Complement

**Description:**

Bitwise complement all the bits in the register. 1’s become 0’s and 0’s become 1’s.

**Instruction Format: R1**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 03h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

1 clock cycle

**Operation**

Rt = ~Ra

**Vector Operation**

for x = 0 to VL-1

if (Vm[x]) Vt[x] = ~Va[x]

else if (z) Vt[x] = 0

else Vt[x] = Vt[x]

**Exceptions**: none

### DEP – Deposit

**Description**:

Insert into a bitfield. Rc specifies the bitfield offset and Rd specifies the width of the bitfield. Rb specifies the data to insert and Ra contains the original source data. The least significant Rd minus one bits of Rb are inserted into Ra at the position specified by Rc. The final result is placed into Rt.

This instruction may also be used to perform a left shift of a single register by specifying x0 for Ra.
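A generic bitfield deposit can be sketched as follows. The function takes the field width directly as a bit count; how the Rd operand maps to that width (the text mentions "Rd minus one") is deliberately left to the caller. The function and parameter names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def deposit(ra, rb, offset, width):
    """Insert the low `width` bits of rb into ra starting at bit `offset`."""
    field_mask = (((1 << width) - 1) << offset) & MASK64
    return ((ra & ~field_mask) | ((rb << offset) & field_mask)) & MASK64
```

With Ra equal to zero and a field that reaches bit 63, the operation reduces to a left shift of Rb by the offset.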

**Formats Supported**: R4

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 13 12 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | Rc6 | Rd6 | A | m3 | z | 59h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 36 | Rb6 | Ra6 | Rt6 | 1Ch8 |

|  |  |
| --- | --- |
| DT3 | Meaning |
| 00 | Rc,Rd are both regs |
| 01 | Rc is a six bit immediate, Rd is a reg |
| 10 | Rd is a six bit immediate, Rc is a reg |
| 11 | Both Rc, Rd are six bit immediates |

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:

### DIF – Difference

**Description:**

This instruction computes the difference between two signed values in registers Ra and Rb and places the result in a target Rt register. The difference is calculated as the absolute value of Ra minus Rb.

**Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 18h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Supported Formats**: .o

**Clock Cycles:** 1

**Execution Units:** Integer

**Operation:**

Rt = Abs(Ra - Rb)

**Exceptions**: none

### DIV[O][Z] – Division

**Description**:

Divide two operand values and place the result in the target register. The first operand must be in a register specified by the Ra field of the instruction. The second operand may be a register specified by the Rb field of the instruction or an immediate value. Both operands are treated as signed values. The register form of this instruction may cause a divide by zero exception if enabled in the instruction.

**Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 10h7 |

**Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 10h7 | OZ2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 27 | 26 25 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 09h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

**Execution Units**: ALU

**Clock Cycles**: 20

**Exceptions**: divide by zero (register form only, if enabled in the instruction)

### DIVR – Division

**Description**:

Because division is not commutative, this instruction is supplied to allow a constant to be divided by a register value, the reverse of how the DIV instruction works. The first operand must be an immediate value. The second operand must be a register specified by the Ra field of the instruction. Both operands are treated as signed values. The result is placed in the target register.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 21h7 |

**Execution Units**: ALU

**Clock Cycles**: 20

**Exceptions**: none

### DIVSU – Division Signed-Unsigned

**Description**:

Divide two operand values and place the result in the target register. The first operand must be in a register specified by the Ra field of the instruction. The second operand may be either a register specified by the Rb field of the instruction or an immediate value. The first operand is treated as a signed value, the second operand as unsigned.

**Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 12h7 |

**Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 12h7 | OZ2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Execution Units**: ALU

**Clock Cycles**: 20

**Exceptions**: none

### DIVU – Division Unsigned

**Description**:

Divide two operand values and place the result in the target register. The first operand must be in a register specified by the Ra field of the instruction. The second operand may be either a register specified by the Rb field of the instruction or an immediate value. Both operands are treated as unsigned values.

**Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 11h7 |

**Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 11h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Execution Units**: ALU

**Clock Cycles**: 20

**Exceptions**: none

### EOR – Bitwise Exclusive Or

**Description:**

This is an alternate mnemonic for the [XOR](#_XOR_–_Bitwise) instruction. Perform a bitwise exclusive or operation between operands. The first operand must be in a register. The second operand may be a register or immediate value. The immediate constant is zero extended before use.

### EXIn – Extended Immediate

**Description:**

These instructions are used to extend the constant field of the following instruction. The constant is extended from bit eleven. Multiple constant extensions may be present to extend a constant up to 64 bits. When multiple extensions are present, they should be placed in order from least significant to most significant (EXI0 first, EXI1 second, EXI2 third). The constant extensions sign-extend to the width of the machine.

Constant extensions may be applied for most instructions with a constant field.

Interrupts are locked out between the modifier and the following instruction.

**Instruction Format: EXI**

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant38..11 | 50h8 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant66..39 | 51h8 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant94..67 | 52h8 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant122..95 | 53h8 |

|  |  |
| --- | --- |
| 35 8 | 7 0 |
| Constant150..123 | 54h8 |
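How the extension words combine into one constant can be sketched in Python. This is an illustration under assumptions, not a description of the hardware: it assumes the following instruction supplies constant bits 10..0, that each EXIn word supplies 28 bits starting at bit 11 + 28n, that the result is sign-extended from the highest supplied bit, and that the machine width is 64 bits. The names `build_constant` and `sign_extend` are hypothetical.

```python
MACHINE_WIDTH = 64  # assumed register width


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def build_constant(low11: int, exi_fields: list[int]) -> int:
    """Combine constant bits 10..0 with the EXI0, EXI1, ... payloads, in that order."""
    value = low11 & 0x7FF
    width = 11
    for field in exi_fields:
        value |= (field & 0xFFFFFFF) << width  # each EXI word carries 28 bits
        width += 28
    width = min(width, MACHINE_WIDTH)          # bits beyond the machine width are dropped
    return sign_extend(value, width)
```

For example, a single EXI0 payload of all ones with zero low bits gives a 39-bit constant whose sign bit is set, which sign-extends to -2048.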

### EXT – Extract Bitfield

**Description**:

A bitfield is extracted from the source by shifting the source to the right and ‘and’ masking. The result is sign extended to the width of the machine. This instruction may be used to sign extend a value from an arbitrary bit position. The width specified should be one less than the desired width. The source value is contained in the register pair Ra, Rb. The field width is specified by Rc and field offset by Rd.

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 1Ch7 |

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:
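The shift-and-mask extraction can be sketched in Python as follows. This is a minimal model under assumptions: the 128-bit source is taken to be the concatenation Rb:Ra with Rb in the upper half, the offset counts from the least significant bit, and the width operand holds the desired width minus one. The function name `extract_field` is hypothetical; the `signed` flag selects EXT-style sign extension or EXTU-style zero extension.

```python
MASK64 = (1 << 64) - 1


def extract_field(ra: int, rb: int, offset: int, width_minus_1: int, signed: bool) -> int:
    """Extract a bitfield from the 128-bit value Rb:Ra and extend it to 64 bits."""
    source = ((rb & MASK64) << 64) | (ra & MASK64)
    width = width_minus_1 + 1
    field = (source >> offset) & ((1 << width) - 1)   # shift right, then 'and' mask
    if signed and field >> (width - 1):               # top bit of the field is set
        field -= 1 << width                           # sign extend
    return field & MASK64                             # result as a 64-bit pattern
```

A field may straddle the two registers: with an offset of 63 and a width operand of 1, the result combines bit 63 of Ra with bit 0 of Rb.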

### EXTU – Extract Bitfield Unsigned

**Description**:

A bitfield is extracted from the source by shifting the source to the right and ‘and’ masking. The result is zero extended to the width of the machine. This instruction may be used to zero extend a value from an arbitrary bit position. The width specified should be one less than the desired width. The source is a 128-bit value which is the concatenation of Rb and Ra. Rc contains the field offset, Rd the width.

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 05h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 1Ch7 |

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:

### FDP – Fused Dot Product

**Description**:

Calculate the dot product x = (a \* b) + (c \* d). The operations are fused together, meaning no rounding occurs until the final result is produced.

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 37h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 03h7 |
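The effect of fusing can be illustrated with a small Python sketch that computes the sum of products exactly with rational arithmetic and rounds only once at the end. This models the single-rounding behaviour described above for double precision values; it says nothing about how the hardware is implemented, and the function names are hypothetical.

```python
from fractions import Fraction


def fused_dot_product(a: float, b: float, c: float, d: float) -> float:
    """Compute (a*b) + (c*d) exactly, then round once to the nearest double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c) * Fraction(d)
    return float(exact)


def unfused_dot_product(a: float, b: float, c: float, d: float) -> float:
    """Reference: each product and the sum are rounded separately."""
    return (a * b) + (c * d)
```

With a = 1 + 2^-30, b = 1 - 2^-30, c = -1 and d = 1, the exact product a \* b is 1 - 2^-60. The unfused form rounds that product to 1.0 and returns 0.0, while the fused form returns -2^-60.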

### FFO – Find First One

**Description**:

A bitfield contained in Ra is searched beginning at the most significant bit to the least significant bit for a bit that is set. The index into the bitfield of the bit that is set is stored in Rt. If no bits are set, then Rt is set equal to -1. The field offset is specified by Rc, the field width by Rd.

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 06h7 | ~2 | ~2 | ~5 | Ta | Ra5 | Tt | Rt5 | v | 1Ch7 |

|  |  |
| --- | --- |
| DT3 | Meaning |
| 00 | Rc,Rd are both regs |
| 01 | Rc is a six bit immediate, Rd is a reg |
| 10 | Rd is a six bit immediate, Rc is a reg |
| 11 | Both Rc, Rd are six bit immediates |

**Clock Cycles**:

**Execution Units:** Integer

**Exceptions**: none
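The search can be sketched in Python as below. This is an illustration under assumptions: the width is taken as a plain bit count, the offset counts from the least significant bit of Ra, and the returned index is relative to the least significant bit of the field. The function name `find_first_one` is hypothetical.

```python
def find_first_one(ra: int, offset: int, width: int) -> int:
    """Return the index of the most significant set bit in the field, or -1 if none."""
    field = (ra >> offset) & ((1 << width) - 1)
    # bit_length() is the position of the highest set bit plus one, and 0 for zero,
    # so subtracting one gives the index, or -1 when no bit is set.
    return field.bit_length() - 1
```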

### MAX – Maximum Value

**Description:**

Determines the maximum of two values in registers Ra, Rb and places the result in the target register Rt.

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 29h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 03h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

**Operation:**

IF Ra > Rb

Rt = Ra

else

Rt = Rb

### MIN – Minimum Value

**Description:**

Determines the minimum of two values in registers Ra, Rb and places the result in the target register Rt.

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 28h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

**Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 02h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

**Operation:**

IF Ra < Rb

Rt = Ra

else

Rt = Rb

### MOD – Instruction Modifier

**Description:**

Used to modify the operation of the following instruction. Modifiers 50h to 52h are used to supply additional constant bits and are described as EXI instructions.

Interrupts are locked out between the modifier and the following instruction.

**Instruction Format: 58/D8 (IMOD)**

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

A[0]: 1 = apply vector mask, 0 = ignore mask spec

A[1]: 1 = apply rounding mode, 0 = ignore rounding mode spec

There are three basic additional elements supplied for the following instruction.

1. A vector mask specification, used only by vector instructions.
2. Two additional source registers
3. A rounding mode specification, useful only to applicable instructions

Two additional register fields allow up to four source operands for the following instruction. If these registers are not required they should be specified as #0.

Application of the vector mask and rounding mode is optional. Two bits in the ‘A’ field indicate which of these modifiers is applied.
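The IMOD field layout can be unpacked with a short Python sketch. Bit positions follow the 58/D8 format table above. The dictionary keys are hypothetical names, and reading Rm3 as the rounding mode and m3 as the vector mask register number is an assumption based on the surrounding text. The Rd field is decoded as the five bits 24..20 given by the bit positions in the table.

```python
def decode_imod(word: int) -> dict:
    """Split a 36-bit IMOD instruction word into its fields."""
    a = (word >> 12) & 0x3
    return {
        "opcode": word & 0xFF,                # bits 7..0, 58h for IMOD
        "z": (word >> 8) & 0x1,               # bit 8
        "mask_reg": (word >> 9) & 0x7,        # m3, bits 11..9
        "apply_mask": bool(a & 1),            # A[0]
        "apply_rounding": bool(a & 2),        # A[1]
        "rc": (word >> 14) & 0x1F,            # bits 18..14
        "rd": (word >> 20) & 0x1F,            # bits 24..20
        "rounding_mode": (word >> 28) & 0x7,  # Rm3, bits 30..28
    }
```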

**Instruction Format: 5A (BRMOD)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 2019 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant15 | Tc2 | Rc5 | Tt | Rt5 | 0 | 5Ah7 |

The 5A modifier applies to branch instructions, both to extend the range of a branch and to allow branch-to-register and branch-and-link capability. When the 5A modifier is present, the Rc register overrides the use of the IP in calculating the branch target address. The target address is then the sum of register Rc and a constant supplied in the instruction.

The constant field of the 5A modifier adds an additional fifteen bits to the branch displacement. This extends the branch range to ±2.4GB.

The register specified by the Rt field may be set to the address of the instruction following the branch, which allows a conditional branch-to-subroutine capability.

**Instruction Format: 5C/DC (STRIDE)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 2019 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| Const15 | Tc2 | Rc5 | A | m3 | z | 5Ch8 |

This format is used with vector load and store instructions to supply stride information and extend the address range of the load / store. Any additional constant modifiers (EXI0, EXI1, EXI2) should be placed before the stride modifier.

### MUL[O] – Multiply

**Description**:

Multiply two values. The first operand must be in a register. The second operand may be in a register or may be an immediate value specified in the instruction. Both the operands are treated as signed values, the result is a signed result. The register form of the instruction may cause an overflow exception if the overflow enable bit in the instruction is set.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 06h7 |

4 clock cycles

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 06h7 | O2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

4 clock cycles

**Exceptions**: overflow (if enabled)

**Float Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 08h7 | O2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 35h7 |

25 clock cycles

**Execution Units**: ALU

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] \* Vb[x]

**Exceptions**: none

### MULF – Fast Unsigned Multiply

**Description**:

Multiply two values. The first operand must be in a register. The second operand may be in a register or may be an immediate value specified in the instruction. Both the operands are treated as unsigned values. The result is an unsigned result. The fast multiply multiplies only the low order 24 bits of the first operand times the low order 16 bits of the second. The result is a 40-bit unsigned product.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 15h7 |

1 clock cycle / N clock cycles (N = vector length)

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 1Ch7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

1 clock cycle / N clock cycles (N = vector length)

**Execution Units**: ALU

**Clock Cycles:** 1

**Exceptions**: none
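The operand truncation described above can be sketched in Python. The function name `mulf` is hypothetical, and only the scalar case is shown.

```python
def mulf(a: int, b: int) -> int:
    """Multiply the low 24 bits of the first operand by the low 16 bits of the second."""
    return (a & 0xFFFFFF) * (b & 0xFFFF)
```

The largest possible product, (2^24 - 1) \* (2^16 - 1), is less than 2^40, so the result always fits in 40 bits.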

### MULU – Unsigned Multiply

**Description**:

Multiply two values. The first operand must be in a register. The second operand may be in a register or may be an immediate value specified in the instruction. Both operands are treated as unsigned values, and the result is an unsigned result.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 0Eh7 |

4 clock cycles

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 0Eh7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

4 clock cycles

**Exceptions**: none

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] \* Vb[x]


### MUX – Multiplex

**Description**:

The MUX instruction performs a bit-by-bit copy of a bit of Rb to the target register if the corresponding bit in Ra is set, or a copy of a bit from Rc if the corresponding bit in Ra is clear.

**Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | ~3 | Tc | ~2 | ~6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 03h7 |

**Exceptions**: none

**Execution Units**: integer ALU
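The bit-by-bit selection performed by MUX is equivalent to the expression sketched below in Python. A 64-bit register width is assumed, and the function name `mux` is hypothetical.

```python
MASK64 = (1 << 64) - 1  # assumed register width


def mux(ra: int, rb: int, rc: int) -> int:
    """Take each result bit from Rb where Ra is set, and from Rc where Ra is clear."""
    return ((ra & rb) | (~ra & rc)) & MASK64
```

For example, with Ra = F0h, Rb = AAh and Rc = 55h, the upper nibble comes from Rb and the lower nibble from Rc, giving A5h.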

### NABS – Negative Absolute Value

**Description:**

Take the negative absolute value of the number in register Ra and place the result into target register Rt. No rounding of the number occurs.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 07h6 | ~6 | Ra6 | Rt6 | 01h8 |

**Integer Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 07h6 | ~2 | m3 | z | Va6 | Vt6 | 81h8 |

**Float Instruction Format: R1**

Both the source and target registers are treated as float values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 21h6 | ~6 | Ra6 | Rt6 | 34h8 |

**Float Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 21h6 | ~2 | m3 | z | Va6 | Vt6 | B4h8 |

**Operation:**

If Ra < 0

Rt = Ra

else

Rt = -Ra

**Clock Cycles**: 1

**Execution Units**: Integer, Floating Point

### NEG - Negate

**Description:**

This is an alternate mnemonic for the SUBF instruction where the constant is zero.

**Instruction Format**: R2

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| 012 | Ra6 | Rt6 | 05h8 |

**Vector Instruction Format**: R2

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| 012 | Va6 | Vt6 | 85h8 |

**Scalar Operation**

Rt = 0 - Ra

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = 0 - Va[x]

else if (z) Vt[x] = 0

else Vt[x] = Vt[x]
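The masked vector operation above, including the zeroing behaviour selected by z, can be sketched in Python. Vectors are modeled as lists, the vector length is taken from the list length, and the function name `vneg` is hypothetical.

```python
def vneg(src: list[int], vt: list[int], vm: list[int], z: bool) -> list[int]:
    """Negate the elements selected by the mask; others are zeroed or left unchanged."""
    result = []
    for x in range(len(src)):
        if vm[x]:
            result.append(0 - src[x])   # mask bit set: element is negated
        elif z:
            result.append(0)            # mask bit clear, z set: element is zeroed
        else:
            result.append(vt[x])        # mask bit clear, z clear: target is unchanged
    return result
```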

**Notes**

For sign-magnitude operations the sign bit is inverted, no subtract occurs. The result is not rounded.

### NOT – Logical Not

**Description:**

This instruction takes the logical ‘not’ value of a register and places the result in a target register. If the source register contains a non-zero value, then a zero is loaded into the target. Otherwise, if the source register contains a zero, a one is loaded into the target register.

NOT reduces the value to a single bit Boolean.

**Integer Instruction Format**: R1

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 24 | 23 21 | 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 04h7 | ~5 | m3 | z | Ta | Ra5 | Tt | Rt5 | v | 01h7 |

1 clock cycle

**Operation:**

Rt = !Ra

**Exceptions**: none

### OR – Bitwise Or

**Description**:

Perform a bitwise or operation between operands. The immediate constant is zero extended before use.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 09h7 |

1 clock cycle / N clock cycles (N = vector length)

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 09h7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

1 clock cycle / N clock cycles (N = vector length)

**Vector Mask Instruction Format: R2 (MADD)**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 25 | 24 22 | 21 18 | 17 15 | 14 11 | 10 8 | 7 0 |
| 01h7 | 04 | Vmb3 | 04 | Vma3 | 04 | Vmt3 | 3Eh8 |

1 clock cycle

**Operation**

Rt = Ra | Immediate

or

Rt = Ra | Rb

**Vector Operation**

for x = 0 to VL-1

if (Vm[x]) Vt[x] = Va[x] | Vb[x] | Vc[x]

**Exceptions**: none

### PERM – Permute Bytes

**Description**:

This instruction allows any combination of bytes in a source register to be copied to a target register. The low order twenty-four bits of register Rb or a 12-bit immediate constant are used to identify which source bytes are copied to the destination. The twenty-four-bit value is composed of eight three-bit fields. Field S0 indicates the source byte for target byte position 0. S1 indicates the source byte for target byte position 1. S2 to S7 work similarly for the remaining target bytes. There are many interesting possibilities with this instruction. A single source byte could be copied to all target byte positions for instance. Or the order of bytes in a word could be reversed.

**Integer Instruction Format: RI**

The immediate format is normally used with a constant extension word as 24 bits are required to resolve the target positions.

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 17h8 |

1 clock cycle

**Integer Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 17h6 | Rb6 | Ra6 | Rt6 | 02h8 |

1 clock cycle

**Execution Units**: integer ALU

**Clock Cycles**: 1

**Exceptions**: none
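The byte selection can be sketched in Python. This is a minimal model under assumptions: field S0 is taken to occupy bits 2..0 of the selector, S1 bits 5..3, and so on, and byte position 0 is the least significant byte. The function name `perm` is hypothetical.

```python
def perm(ra: int, selector: int) -> int:
    """Build the target from eight source bytes chosen by eight 3-bit selector fields."""
    result = 0
    for i in range(8):
        src = (selector >> (3 * i)) & 0x7      # field Si: source byte for target byte i
        byte = (ra >> (8 * src)) & 0xFF
        result |= byte << (8 * i)
    return result


# Selector that reverses the byte order: target byte i takes source byte 7 - i.
REVERSE_BYTES = sum((7 - i) << (3 * i) for i in range(8))
```

A selector of zero copies source byte 0 to every target byte position, and `REVERSE_BYTES` reverses the order of bytes in the word.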

### PTRDIF – Difference Between Pointers

**Description**:

Subtract two values then shift the result right. Both operands must be in registers. The right shift is provided to accommodate common object sizes. It may still be necessary to perform a divide operation after the PTRDIF to obtain an index into odd sized or large objects. The shift amount Rc may vary from zero to thirty-one. This instruction always uses a modifier to supply Rc or an immediate constant.

**Integer Instruction Format: R3**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | 03 | 06 | Rc6 | A | m3 | z | 59h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 18h6 | Rb6 | Ra6 | Rt6 | 03h8 |

1 clock cycle

**Integer Vector Instruction Format: R3**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | 03 | 06 | Vc6 | A | m3 | z | B9h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 18h6 | Vb6 | Va6 | Vt6 | 83h8 |

1 clock cycle

**Operation**:

Rt = Abs(Ra – Rb) >> Rc

**Clock Cycles**: 1

**Execution Units**: Integer

**Exceptions**: none
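The operation can be sketched in Python as follows. The function name `ptrdif` is hypothetical, and masking the shift amount to five bits is an assumption based on the stated range of zero to thirty-one.

```python
def ptrdif(ra: int, rb: int, rc: int) -> int:
    """Absolute difference of two pointers, shifted right by the object-size exponent."""
    return abs(ra - rb) >> (rc & 0x1F)
```

For two pointers 64 bytes apart into an array of 8-byte objects, a shift of 3 gives an element distance of 8, regardless of which pointer is larger.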

### SEQ – Set if Equal

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is equal to a second operand in register (Rb) or an immediate constant then the target register is set to a one, otherwise the target register is set to a zero. Comparing float values returns an integer.

For floating-point operations positive and negative zero are considered equal.

If a vector operation is taking place then the target register is one of the vector mask registers.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 26h8 |

**Instruction Format: R2**

**Integer:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 26h6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Float:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 11h6 | Rb6 | Ra6 | Rt6 | 35h8 |

**Vector Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Va6 | Vt6 | A6h8 |

**Vector Instruction Format: R2**

**Integer:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 26h6 | Vb6 | Va6 | Vt6 | 82h8 |

**Float:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 11h6 | Rb6 | Ra6 | Rt6 | B5h8 |

### SGE – Set if Greater Than or Equal

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is greater than or equal to a second operand in register (Rb) then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as signed values.

There is no immediate form to this instruction. An immediate equivalent may be achieved using the [SGT](#_SGT_–_Set) instruction and adjusting the constant by one.

**Instruction Format: R2**

**Integer:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Dh6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Float:**

The float version is an alternate mnemonic for [SLE](#_SLE_–_Set) where the operands have been swapped.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 13h6 | Ra6 | Rb6 | Rt6 | 35h8 |
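The immediate equivalent mentioned above relies on the identity Ra ≥ c ⇔ Ra > c − 1 for integers. A minimal Python sketch follows; the function names are hypothetical, and the identity does not hold if c − 1 falls outside the range of the constant field.

```python
def sgt_imm(ra: int, const: int) -> int:
    """SGT, immediate form: 1 if Ra > constant, otherwise 0."""
    return 1 if ra > const else 0


def sge_imm(ra: int, const: int) -> int:
    """Greater-than-or-equal against a constant, built from SGT with constant - 1."""
    return sgt_imm(ra, const - 1)
```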

### SGEU – Set if Greater Than or Equal Unsigned

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is greater than or equal to a second operand in register (Rb) then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as unsigned values.

There is no immediate form to this instruction. An immediate equivalent may be achieved using the SGTU instruction and adjusting the constant by one.

**Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Fh6 | Rb6 | Ra6 | Rt6 | 02h8 |

### SGT – Set if Greater Than

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is greater than a second operand which is a constant supplied in the instruction, then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as signed values.

There is no register form of this instruction. The register equivalent operation may be performed using the [SLT](#_SLT_–_Set) instruction and swapping the registers.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 29h8 |

**Integer Instruction Format: R2 (SLT)**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Ch6 | Ra6 | Rb6 | Rt6 | 02h8 |

**Float Instruction Format: R2 (SLT)**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 12h6 | Ra6 | Rb6 | Rt6 | 35h8 |

**Vector Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Va6 | Vt6 | A9h8 |

**Integer Vector Instruction Format: R2 (SLT)**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Ch6 | Va6 | Vb6 | Vt6 | 82h8 |

**Float Vector Instruction Format: R2 (SLT)**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 12h6 | Va6 | Vb6 | Vt6 | B5h8 |

### SGTU – Set if Greater Than Unsigned

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is greater than a second operand which is a constant supplied in the instruction, then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as unsigned values.

There is no register form of this instruction. The register equivalent operation may be performed using the [SLTU](#_SLTU_–_Set) instruction and swapping the registers.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 2Bh8 |

**Integer Instruction Format: R2 (SLTU)**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Eh6 | Ra6 | Rb6 | Rt6 | 02h8 |

### SIGN – Sign (Compare to Zero)

**Synopsis**

Take the sign of a value. This is an extended mnemonic for the [CMP](#_CMP_–_Compare) instruction.

**Description**

The sign of a register is placed in the target register Rt.

**Instruction Format: R1**

**Integer:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Ah6 | 06 | Ra6 | Rt6 | 02h8 |

**Float:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 10h6 | 06 | Ra6 | Rt6 | 35h8 |

**Operation:**

Rt = Ra < 0 ? –1 : Ra = 0 ? 0 : 1

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] < 0 ? –1 : Va[x]=0 ? 0 : 1

### SLL – Shift Left Logical

**Description**:

Left shift an operand value by a shift amount and place the result in the target register. Zeros are shifted into the least significant bits. The first operand must be in a register specified by the Ra field of the instruction. The second operand, the shift amount, may be either a register specified by the Rb field of the instruction or an immediate value.

**Instruction Formats**: R2

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 19h6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Instruction Formats**: R2I

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 1Ah6 | Imm6 | Ra6 | Rt6 | 02h8 |

**Vector Instruction Formats**: R2

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 19h6 | Vb6 | Va6 | Vt6 | 82h8 |

**Vector Instruction Formats**: R2I

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 1Ah6 | Imm6 | Va6 | Vt6 | 82h8 |

**Vector Mask Instruction Format: R2 (MSLL)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 11 | 10 8 | 7 0 |
| 0Eh6 | imm6 | 03 | Vma3 | 03 | Vmt3 | 80h8 |

1 clock cycle

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:

### SLLP – Shift Left Logical Pair

**Description**:

Left shift a pair of operand values by a shift amount and place the upper 64 bits of the result in the target register. Zeros are shifted into the least significant bits. The operand pair must be in registers specified by the Ra and Rc fields of the instruction. The third operand, the shift amount, may be either a register specified by the Rb field of the instruction or an immediate value.

This instruction may also be used to perform a left rotate of a single register by specifying the same register for Ra and Rc.

**Instruction Formats**: R3

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | Rd6 | Rc6 | A | m3 | z | 59h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 10h6 | Rb6 | Ra6 | Rt6 | 03h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 11h6 | Imm6 | Ra6 | Rt6 | 03h8 |

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:
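The pair shift and its use as a rotate can be sketched in Python. This is an illustration under assumptions: Ra is taken as the upper half of the 128-bit pair and Rc as the lower half, which the text does not state, and the shift amount is masked to six bits. The function name `sllp` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def sllp(ra: int, rc: int, shift: int) -> int:
    """Shift the 128-bit pair Ra:Rc left and return the upper 64 bits."""
    shift &= 0x3F                                   # assumed shift range of 0..63
    pair = ((ra & MASK64) << 64) | (rc & MASK64)
    return ((pair << shift) >> 64) & MASK64


def rotate_left(value: int, shift: int) -> int:
    """Left rotate of one register: the same register is used for Ra and Rc."""
    return sllp(value, value, shift)
```

When both halves of the pair hold the same value, the bits shifted out of the top re-enter at the bottom, so the choice of which register forms the upper half does not affect the rotate.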

### SLT – Set if Less Than

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is less than a second operand in either a register (Rb) or a constant supplied in the instruction, then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as signed values.

The register form of the instruction may also be used to test for greater than by swapping the operands around.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 28h8 |

**Integer Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Ch6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Integer Vector Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | A8h8 |

**Integer Vector Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Ch6 | Rb6 | Ra6 | Rt6 | 82h8 |

**Float Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 12h6 | Rb6 | Ra6 | Rt6 | 35h8 |

**Float Vector Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 12h6 | Rb6 | Ra6 | Rt6 | B5h8 |

**Float Vector Set Mask Register:**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 36h6 | Rb6 | Ra6 | 23 | Vmt3 | B5h8 |

### SLE – Set if Less Than or Equal

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is less than or equal to a second operand in register (Rb) then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as signed values.

There is no immediate form to this instruction. An immediate equivalent may be achieved using the SLT instruction and adjusting the constant by one.

**Instruction Format: R2**

**Integer:**

The integer register form of instruction is an alternate mnemonic for [SGE](#_SGE_–_Set) where the operands have been swapped.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Dh6 | Ra6 | Rb6 | Rt6 | 02h8 |

**Float:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 13h6 | Rb6 | Ra6 | Rt6 | 35h8 |

**Integer Vector:**

The integer register form of instruction is an alternate mnemonic for [SGE](#_SGE_–_Set) where the operands have been swapped.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Dh6 | Va6 | Vb6 | Vt6 | 82h8 |

**Float Vector:**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 13h6 | Vb6 | Va6 | Vt6 | B5h8 |

**Float Vector Set Mask Register:**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 36h6 | Vb6 | Va6 | 33 | Vmt3 | B5h8 |

### SLEU – Set if Less Than or Equal Unsigned

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is less than or equal to a second operand in register (Rb) then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as unsigned values.

This instruction is an alternate mnemonic for the SGEU instruction where the operands have been swapped.

There is no immediate form to this instruction. An immediate equivalent may be achieved using the SLTU instruction and adjusting the constant by one.

**Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Fh6 | Ra6 | Rb6 | Rt6 | 02h8 |

### SLTU – Set if Less Than Unsigned

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is less than a second operand in either a register (Rb) or a constant supplied in the instruction, then the target register is set to a one, otherwise the target register is set to a zero. The operands are treated as unsigned values.

The register form of the instruction may also be used to test for greater than by swapping the operands around.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 2Ah8 |

**Integer Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Eh6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Integer Vector Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 2Eh6 | Ra6 | Rb6 | Rt6 | 82h8 |

### SNE – Set if Not Equal

**Description:**

The set instruction places a 1 or 0 in the target register based on the relationship between the two source operands. If operand Ra is not equal to a second operand in register (Rb) or an immediate constant then the target register is set to a one, otherwise the target register is set to a zero.

For floating-point operations positive and negative zero are considered equal.

**Integer Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 27h8 |

**Integer Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 27h6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Integer Vector Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Va6 | Vt6 | A7h8 |

**Integer Vector Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 27h6 | Vb6 | Va6 | Vt6 | 82h8 |

**Integer Vector Set Mask Register:**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 36h6 | Vb6 | Va6 | 13 | Vmt3 | 82h8 |

**Float Vector Set Mask Register:**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 36h6 | Vb6 | Va6 | 13 | Vmt3 | B5h8 |

### SQRT – Square Root

**Description:**

This instruction takes the square root of a register and places the result in a target register.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 86 | ~6 | Ra6 | Rt6 | 01h8 |

**Integer Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 08h6 | ~2 | m3 | z | Va6 | Vt6 | 81h8 |

**Float Instruction Format: R1**

Both the source and target registers are treated as float values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 08h6 | ~6 | Ra6 | Rt6 | 34h8 |

**Float Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 08h6 | ~2 | m3 | z | Va6 | Vt6 | B4h8 |

### SRA –Shift Right Arithmetic Pair

**Description**:

This is an alternate mnemonic for the signed field extract [EXT](#_EXT_–Extract_Bitfield) instruction.

Right shift a pair of operand values by an operand value and place the result in the target register. The lower 64 bits of the result are placed in the target register. The sign bit is shifted into the most significant bits. The operand pair must be in registers specified by the Ra and Rb field of the instruction. The third operand may be either a register specified by the Rc field of the instruction, or an immediate value.

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | 636 | Rc6 | A | m3 | z | 59h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 46 | Rb6 | Ra6 | Rt6 | 1Ch8 |

|  |  |
| --- | --- |
| DT3 | Meaning |
| 10 | Rd is a six bit immediate, Rc is a reg |
| 11 | Both Rc, Rd are six bit immediates |

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:

### SRL –Shift Right Logical

**Description**:

Right shift an operand value by an operand value and place the result in the target register. Zeros are shifted into the most significant bits. The first operand must be in a register specified by the Ra field of the instruction. The second operand may be either a register specified by the Rb field of the instruction, or an immediate value.

**Instruction Formats**: R2

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 21h6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Instruction Formats**: R2I

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 22h6 | Imm6 | Ra6 | Rt6 | 02h8 |
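
As an illustration of the R2I layout tabulated above, the following Python sketch packs the fields into a 32-bit word and models the logical shift itself. The function names are hypothetical, a 64-bit register width is assumed, and the field positions are taken directly from the table.

```python
MASK64 = (1 << 64) - 1

def encode_srl_imm(rt, ra, imm6):
    """Pack the SRL R2I fields: 22h | Imm6 | Ra6 | Rt6 | 02h."""
    for name, value in (("rt", rt), ("ra", ra), ("imm6", imm6)):
        if not 0 <= value < 64:
            raise ValueError(f"{name} must fit in six bits")
    return (0x22 << 26) | (imm6 << 20) | (ra << 14) | (rt << 8) | 0x02

def srl(value, amount):
    """Logical right shift: zeros enter at the most significant end."""
    if not 0 <= amount < 64:
        raise ValueError("shift amount must be 0..63 in this sketch")
    return (value & MASK64) >> amount
```

Under these assumptions, shifting x2 right by four into x3 encodes as `0x88408302`.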

**Vector Mask Instruction Format: R2 (MSRL)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 11 | 10 8 | 7 0 |
| 0Fh6 | imm6 | 03 | Vma3 | 03 | Vmt3 | 80h8 |

1 clock cycle

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:

### SRLP –Shift Right Logical Pair

**Description**:

This is an alternate mnemonic for the unsigned field extract [EXTU](#_EXTU_–Extract_Bitfield) instruction.

Right shift a pair of operand values by an operand value and place the result in the target register. The lower 64 bits of the result are placed in the target register. Zeros are shifted into the most significant bits. The operand pair must be in registers specified by the Ra and Rb field of the instruction. The third operand may be either a register specified by the Rc field of the instruction, or an immediate value.

This instruction may also be used to perform a right rotate of a single register by specifying the same register for Ra and Rb.
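
The pair shift and the rotate trick can be sketched in Python. This is a sketch under stated assumptions: the name `srlp` is used here for a hypothetical model function, Rb is assumed to supply the upper 64 bits of the pair, and shift amounts are limited to 0..63.

```python
MASK64 = (1 << 64) - 1

def srlp(ra, rb, amount):
    """Shift the 128-bit pair {rb:ra} right and keep the low 64 bits."""
    if not 0 <= amount < 64:
        raise ValueError("shift amount must be 0..63 in this sketch")
    pair = ((rb & MASK64) << 64) | (ra & MASK64)
    return (pair >> amount) & MASK64

def rotate_right(value, amount):
    """Rotate a single register by passing it as both halves of the pair."""
    return srlp(value, value, amount)
```

Passing the same value for both halves makes the bits shifted out of the low end reappear at the high end, which is why the rotate works regardless of which register holds the upper half.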

**Instruction Format**: R4

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | 636 | Rc6 | A | m3 | z | 59h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 56 | Rb6 | Ra6 | Rt6 | 1Ch8 |

|  |  |
| --- | --- |
| DT3 | Meaning |
| 10 | Rd is a six bit immediate, Rc is a reg |
| 11 | Both Rc, Rd are six bit immediates |

**Operation Size:** .o

**Execution Units**: integer ALU

**Exceptions**: none

**Example**:

### SUB - Subtract

**Description:**

Subtract two values. Both operands must be in registers.

**Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 56 | Rb6 | Ra6 | Rt6 | 02h8 |

**Scalar Operation**

Rt = Ra - Rb

**Vector Operation**

for x = 0 to VL - 1

if (Vm[x]) Vt[x] = Va[x] - Vb[x]

else if (z) Vt[x] = 0

else Vt[x] = Vt[x]
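
The masked vector operation above can be modelled in Python as a minimal sketch. The function name `vsub` and the representation of vector registers as lists and the mask as an integer bit field are assumptions made for illustration; 64-bit elements are assumed.

```python
MASK64 = (1 << 64) - 1

def vsub(vt, va, vb, vm, z, vl):
    """Masked vector subtract; returns the new contents of Vt."""
    result = list(vt)
    for x in range(vl):                  # x = 0 to VL - 1
        if (vm >> x) & 1:                # mask bit set: compute the element
            result[x] = (va[x] - vb[x]) & MASK64
        elif z:                          # mask clear, z set: zero the element
            result[x] = 0
        # mask clear, z clear: element is left unchanged
    return result
```

With a mask of `0b0101` only elements 0 and 2 are computed; the others are either preserved or zeroed depending on `z`.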

### SUBF – Subtract From

**Description:**

Subtract a register value from an immediate value. The first operand (Ra) must be in a register and is the value subtracted. The second operand must be an immediate value specified in the instruction and is the value subtracted from. There is no register form for this instruction.

**Instruction Format: RI**

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 05h8 |

**Operation:**

Rt = Imm - Ra

**Exceptions:** none

### U21NDX – UTF21 Index

**Description:**

This instruction searches Ra, which is treated as an array of three UTF21 values, for a value specified by Rb and places the index of the value into the target register Rt. If the UTF21 value is not found -1 is placed in the target register. A common use would be to search for a null. The index result may vary from -1 to +2. The index of the first found value is returned (closest to zero).

**Integer Instruction Format: RI**

The RI instruction format may be used with an immediate extension word for full 21-bit constants.

|  |  |  |  |
| --- | --- | --- | --- |
| 31 20 | 19 14 | 13 8 | 7 0 |
| Constant12 | Ra6 | Rt6 | 23h8 |

1 clock cycle

**Instruction Format:** R2

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 23h6 | Rb6 | Ra6 | Rt6 | 02h8 |

**Supported Formats**: .o

**Clock Cycles:** 1

**Execution Units:** Integer ALU

**Operation:**

Rt = Index of (Rb in Ra)
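
A minimal Python sketch of the search follows. The function name `u21ndx` is a hypothetical model, and it assumes that element 0 occupies the least significant 21 bits of Ra with elements 1 and 2 packed immediately above it.

```python
U21_MASK = (1 << 21) - 1

def u21ndx(ra, rb):
    """Return the index (0..2) of the first UTF21 value in ra equal to rb, or -1."""
    target = rb & U21_MASK
    for index in range(3):
        if ((ra >> (21 * index)) & U21_MASK) == target:
            return index
    return -1
```

Searching for zero locates a null terminator, as suggested in the description.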

**Exceptions:** none

### WYDNDX – Wyde Index

**Description:**

This instruction searches Ra, which is treated as an array of four wydes, for a wyde value specified by Rb and places the index of the wyde into the target register Rt. If the wyde is not found -1 is placed in the target register. A common use would be to search for a null wyde. The index result may vary from -1 to +3. The index of the first found wyde is returned (closest to zero).

**Integer Instruction Format: RI**

The RI instruction format may be used with an immediate extension word for full 16-bit constants.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 22 | 2120 | 19 15 | 14 13 | 12 8 | 7 | 6 0 |
| Constant14 | Ta2 | Ra5 | Tt2 | Rt5 | v | 1Bh7 |

1 clock cycle

**Instruction Format:** R2

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 14 13 | 12 8 | 7 | 6 0 |
| 1Bh7 | Tb2 | Rb5 | Ta2 | Ra5 | Tt2 | Rt5 | v | 02h7 |

**R2 Supported Formats**: .o

**Clock Cycles:** 1

**Execution Units:** Integer ALU

**Operation:**

Rt = Index of (Rb in Ra)
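
One possible use, sketched in Python, is finding the length of a null-terminated wyde string four wydes at a time. The names `wydndx` and `wyde_strlen` are hypothetical models, and wyde 0 is assumed to sit in the least significant 16 bits of each octa.

```python
def wydndx(ra, rb):
    """Return the index (0..3) of the first wyde in ra equal to rb, or -1."""
    target = rb & 0xFFFF
    for index in range(4):
        if ((ra >> (16 * index)) & 0xFFFF) == target:
            return index
    return -1

def wyde_strlen(octas):
    """Length in wydes of a null-terminated string held in 64-bit words."""
    for word_number, word in enumerate(octas):
        index = wydndx(word, 0)          # search this octa for a null wyde
        if index >= 0:
            return word_number * 4 + index
    return -1                            # no terminator found
```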

**Exceptions:** none

### XOR – Bitwise Exclusive Or

**Description:**

Perform a bitwise exclusive or operation between operands. The first operand must be in a register. The second operand may be a register or immediate value. A third operand must be in a register. The immediate constant is zero extended before use.

**Integer Instruction Format: RI**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| Constant16 | Ta | Ra5 | Tt | Rt5 | v | 0Ah7 |

1 clock cycle / N clock cycles (N = vector length)

**Integer Instruction Format: R2**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 | 6 0 |
| 0Ah7 | ~2 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | v | 02h7 |

1 clock cycle / N clock cycles (N = vector length)

**Vector Mask Instruction Format: R2 (MADD)**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 28 25 | 24 22 | 21 18 | 17 15 | 14 11 | 10 8 | 7 0 |
| 02h7 | 04 | Vmb3 | 04 | Vma3 | 04 | Vmt3 | 3Eh8 |

1 clock cycle

**Operation**

**Rt = Ra ^ Immediate**

**OR**

**Rt = Ra ^ Rb**

**Vector Operation**

for x = 0 to VL-1

if (Vm[x]) Vt[x] = Va[x] ^ Vb[x] ^ Vc[x]

else if (z) Vt[x] = 0

else Vt[x] = Vt[x]

**Exceptions**: none

### ZXB –Zero Extend Byte

**Description**:

Zero extend byte.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Ch6 | ~6 | Ra6 | Rt6 | 01h8 |

**Clock Cycles**: 1

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:

### ZXW –Zero Extend Wyde

**Description**:

Zero extend wyde.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Dh6 | ~6 | Ra6 | Rt6 | 01h8 |

**Integer Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 0Dh6 | ~2 | m3 | z | Va6 | Vt6 | 81h8 |

**Clock Cycles**: 1

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:

### ZXT –Zero Extend Tetra

**Description**:

Zero extend tetra.

**Integer Instruction Format: R1**

Both the source and target registers are treated as integer values.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Eh6 | ~6 | Ra6 | Rt6 | 01h8 |

**Integer Vector Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 0Eh6 | ~2 | m3 | z | Va6 | Vt6 | 81h8 |

**Clock Cycles**: 1

**Execution Units:** Integer ALU

**Exceptions**: none

**Notes**:

## Graphics

### BLEND – Blend Colors

**Description**:

This instruction blends two colors whose values are in Ra and Rb according to an alpha value in Rc. The resulting color is placed in register Rt. The alpha value is an eight-bit value assumed to be a binary fraction less than one. The color values in Ra and Rb are assumed to be RGB888 format colors. The result is an RGB888 format color. The high order eight bits of the result register are set to the high order eight bits of Ra. Note that a close approximation to 1.0 – alpha is used. Each component of the color is blended.

**Instruction Format**: R3

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | 03 | 06 | Rc6 | 0 | m3 | z | 58h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 30h6 | Rb6 | Ra6 | Rt6 | 03h8 |

**Operation**:

Rt.R = (Ra.R \* alpha) + (Rb.R \* ~alpha)

Rt.G = (Ra.G \* alpha) + (Rb.G \* ~alpha)

Rt.B = (Ra.B \* alpha) + (Rb.B \* ~alpha)
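
The per-component blend can be sketched in Python. Several details here are assumptions rather than statements from this manual: the function name `blend`, the channel layout (R in bits 23..16, G in 15..8, B in 7..0 of a 32-bit color), the use of the 8-bit complement as the approximation to 1.0 – alpha, and the final shift right by eight to rescale the product.

```python
def blend(ra, rb, alpha):
    """Blend two RGB888 colors; alpha is an 8-bit binary fraction."""
    alpha &= 0xFF
    inv_alpha = ~alpha & 0xFF            # 255 - alpha, close to 1.0 - alpha
    result = ra & 0xFF000000             # high order eight bits come from Ra
    for shift in (16, 8, 0):             # R, G, B channels
        a = (ra >> shift) & 0xFF
        b = (rb >> shift) & 0xFF
        mixed = (a * alpha + b * inv_alpha) >> 8
        result |= mixed << shift
    return result
```

Because the two weights sum to 255 rather than 256, a fully weighted channel comes out slightly below its input value in this model.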

**Clock Cycles**: 2

### TRANSFORM – Transform Point

**Description:**

The point transform instruction transforms a point from one location to another using a transform function. The transform function has 12 co-efficients in the form of a matrix to be used in the calculation.

Points are represented in 16.16 fixed-point format.

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 11h9 | 06 | Ra6 | Rt6 | 01h8 |

**Clock Cycles**: 2

### RW\_COEFF – Read/Write Co-efficient

**Description:**

RW\_COEFF reads and writes a coefficient value to be used for the transform matrix. Ra contains the number of the coefficient to read or write. Rb contains the new value for the coefficient.

**Instruction Format: R2**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 3Eh9 | Rb6 | Ra6 | Rt6 | 02h8 |

**Co-efficient Matrix:**

|  |  |  |  |
| --- | --- | --- | --- |
| AA | AB | AC | AT |
| BA | BB | BC | BT |
| CA | CB | CC | CT |

|  |  |
| --- | --- |
| Regno in Ra | Coefficient Accessed |
| 0 | AA |
| 1 | AB |
| 2 | AC |
| 3 | AT |
| 4 | BA |
| 5 | BB |
| 6 | BC |
| 7 | BT |
| 8 | CA |
| 9 | CB |
| 10 | CC |
| 11 | CT |
| 12 | CMD – bit 0, 1=transform, 0 = pass through |
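
How a 3 x 4 co-efficient matrix might be applied to a 16.16 fixed-point point is sketched below in Python. This assumes the conventional affine form (for example x' = AA·x + AB·y + AC·z + AT), that the co-efficients are themselves 16.16 values, and that a point is handled as a tuple of three components; none of these details is confirmed by the text above, and the function names are hypothetical.

```python
ONE = 1 << 16                            # 1.0 in 16.16 fixed point

def to_fixed(value):
    """Convert a Python number to 16.16 fixed point."""
    return int(round(value * ONE))

def fixed_mul(a, b):
    """Multiply two 16.16 values, keeping a 16.16 result."""
    return (a * b) >> 16

def transform(point, matrix):
    """Apply rows (AA AB AC AT), (BA BB BC BT), (CA CB CC CT) to a point."""
    x, y, z = point
    return tuple(
        fixed_mul(row[0], x) + fixed_mul(row[1], y) + fixed_mul(row[2], z) + row[3]
        for row in matrix
    )
```

With an identity rotation part, the AT, BT and CT column acts as a pure translation.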

## Memory Operations

### CACHE – Cache Command

CACHE Cmd, d[Rn]

**Description:**

This instruction commands the cache controller to perform an operation. Commands are summarized in the command table below. Commands may be issued to both the instruction and data cache at the same time. The address of the cache line to be invalidated is passed in Ra if needed.

**Instruction Formats**: CACHE

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 28 | 27 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 153..0 | Const8 | Ra6 | DC3 | IC3 | 60h8 |

**Commands:**

|  |  |  |
| --- | --- | --- |
| IC3 | Mne. | Operation |
| 0 | NOP | no operation |
| 1 | invline | invalidate line associated with given address |
| 2 | invall | invalidate the entire cache (address is ignored) |
| 3 to 7 |  | reserved |

|  |  |  |
| --- | --- | --- |
| DC3 | Mne. | Operation |
| 0 | NOP | no operation |
| 1 | enable | enable cache (instruction cache is always enabled) |
| 2 | disable | not valid for the instruction cache |
| 3 | invline | invalidate line associated with given address |
| 4 | invall | invalidate the entire cache (address is ignored) |
| 5 to 7 |  | reserved |

Notes:

### LDx – Load

**Description**:

Load a value from memory into a register.

**Formats Supported**:

**Register Indirect with Displacement**

This mode may make use of immediate prefixes to extend the range.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 31 20 | 19 | 18 14 | 13 | 12 8 | 7 0 |
| Func3..0 | Const12 | Ta | Ra5 | Tt | Rt5 | 60h8 |

**Scalar Indexed Form (LD)**

The effective address (EA) is calculated as the sum of Ra plus Rb multiplied by a scale.

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 3130 | 29 27 | 2625 | 24 20 | 19 | 18 14 | 13 | 12 8 | 7 0 |
| Func4 | ~2 | Sc3 | Tb2 | Rb5 | Ta | Ra5 | Tt | Rt5 | 61h8 |

z: 1= zero extend, 0 = sign extend

|  |  |
| --- | --- |
| S | Multiplier |
| 0 | 1 |
| 1 | operand size |

***Operation:***

Rt = Memory[d + Ra + Rb \* Sc]

**Vector forms**

**Strided Form (LDS)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| Const11 | I | Rc6 | A | m3 | z | 5Ch8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 28 | 27 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | Const8 | Ra6 | Rt6 | E2h8 |

Data is loaded from memory addresses separated by the stride amount specified by register field Rc, beginning with the sum of Ra and an immediate value. If the vector mask bit is clear and the ‘z’ bit is set in the instruction then the corresponding element of the vector register is loaded with zero. If the vector mask bit is clear and the ‘z’ bit is clear in the instruction then the corresponding element of the vector register is left unchanged (no value is loaded from memory).

Elements are loaded only up to the length specified in the vector length register.

|  |  |  |
| --- | --- | --- |
| Vm[x] | z | Result |
| 0 | 0 | Vt[x] = Vt[x] (unchanged) |
| 0 | 1 | Vt[x] = 0 (set to zero) |
| 1 | 0 | Vt[x] = memory, sign extended |
| 1 | 1 | Vt[x] = memory, zero extended |

|  |  |
| --- | --- |
| Func4 | Operation Size |
| 0 | byte |
| 1 | wyde |
| 2 | tetra |
| 3 | octa |
| 4 | hexi (double octa) |
| 5 | quad octa |
| 6 | reserved |
| 7 | pointer |
| … | reserved |
| 15 | cache cmd |

***Operation:***

for x = 0 to vector length

if (Vm[x])

Vt[x] = Memory[d+Ra + Rb \* x]

else

Vt[x] = z ? 0 : Vt[x]
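
The strided load can be modelled with the following Python sketch. It is an illustration under assumptions: the function name `lds` and its parameters are hypothetical, memory is a `bytes` object, values are read little-endian and zero extended, the stride is passed as a plain integer, and the loop covers elements 0 to VL - 1.

```python
def lds(memory, vt, base, disp, stride, vm, z, vl, size=8):
    """Masked strided load; returns the new contents of Vt."""
    result = list(vt)
    for x in range(vl):
        if (vm >> x) & 1:
            ea = disp + base + stride * x          # address of element x
            result[x] = int.from_bytes(memory[ea:ea + size], "little")
        elif z:
            result[x] = 0                          # masked off, zeroing
        # masked off without z: element unchanged, no memory access
    return result
```

Note that a masked-off element performs no memory access at all in this model, matching the statement that no value is loaded from memory.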

**Indexed Form**

Data is loaded from memory addresses beginning with the sum of Ra and a vector element from Vc.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| Const11 | 0 | Vc6 | A | m3 | z | DCh8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 28 | 27 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | Const8 | Ra6 | Rt6 | E3h8 |

***Operation:***

n = 0

for x = 0 to vector length

if (Vm[x])

Vt[x] = Memory[d + Ra + Vb[x]]

else

Vt[x] = z ? 0 : Vt[x]

**Exceptions**: none

### LDB – Load Byte (8 bits)

**Description**:

Data is loaded from the memory address which is the sum of an immediate value and the sum of Ra and Rb times a scale. The value loaded is sign extended from bit 7 to the machine width.

**Formats Supported**: LD

**Operation:**

Rt = Memory8[d + Ra + Rb\*Sc]

**Exceptions**: none

### LDBZ – Load Byte, Zero Extend (8 bits)

**Description**:

Data is loaded from the memory address which is the sum of an immediate value and the sum of Ra and Rb times a scale. The value loaded is zero extended from bit 8 to the machine width.

**Formats Supported**: LD

**Operation:**

Rt = Memory8[d + Ra + Rb\*Sc]
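
The difference between the sign-extending loads (LDB, LDW, LDT) and their zero-extending counterparts can be sketched in Python. The helper names are hypothetical and a 64-bit machine width is assumed.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Replicate the top bit of a 'bits'-wide value up to 64 bits."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64

def zero_extend(value, bits):
    """Clear every bit above the low 'bits' bits."""
    return value & ((1 << bits) - 1)
```

A loaded byte of `0x80` therefore becomes `0xFFFFFFFFFFFFFF80` under sign extension and `0x80` under zero extension.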

**Exceptions**: none

### LDO – Load Octa (64 bits)

**Description**:

Data is loaded into Rt from the memory address which is the sum of an immediate value and the sum of Ra and Rb scaled.

**Formats Supported**: RR,RI

**Operation:**

Rt = Memory64[d + Ra + Rb\*Sc]

**Execution Units**: Mem

**Exceptions**: none

### LDT – Load Tetra (32 bits)

**Description**:

Data is loaded from the memory address which is the sum of Ra and an immediate value or the sum of Ra and Rb scaled. The value loaded is sign extended from bit 31 to the machine width.

**Formats Supported**: RR,RI

**Operation:**

Rt = Memory32[d + Ra + Rb\*Sc]

**Execution Units**: Mem

**Exceptions**: none

### LDTZ – Load Tetra, Zero Extend (32 bits)

**Description**:

Data is loaded from the memory address which is the sum of Ra and an immediate value or the sum of Ra and Rb scaled. The value loaded is zero extended from bit 32 to the machine width.

**Formats Supported**: RR,RI

**Operation:**

Rt = Memory32[d + Ra + Rb\*Sc]

**Execution Units**: Mem

**Exceptions**: none

### LDW – Load Wyde (16 bits)

**Description**:

Data is loaded from the memory address which is the sum of Ra and an immediate value or the sum of Ra and Rb scaled. The value loaded is sign extended from bit 15 to the machine width.

**Formats Supported**: LD

**Operation:**

Rt = Memory16[d + Ra + Rb\*Sc]

**Execution Units**: Mem

**Exceptions**: none

### LDWZ – Load Wyde, Zero Extend (16 bits)

**Description**:

Data is loaded from the memory address which is the sum of Ra and an immediate value or the sum of Ra and Rb scaled. The value loaded is zero extended from bit 16 to the machine width.

**Formats Supported**: LD

**Operation:**

Rt = Memory16[d + Ra + Rb\*Sc]

**Execution Units**: Mem

**Exceptions**: none

### LEA – Load Effective Address

**Description**:

This instruction computes the effective address for a load/store operation. The data type tag for the target register is set to indicate it contains a pointer.

**Formats Supported**:

**Scalar Indexed Form (LD)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 28 | 27 | 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | 1 | S | Rb6 | Ra6 | Rt6 | 61h8 |

***Operation:***

Rt = d + Ra + Rb \* Sc

**Vector forms**

**Strided Form (LDS)**

|  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 63 50 | 4948 | 47 44 | 4341 | 40 | 39 32 | 31 24 | 23 16 | 15 8 | 7 0 |
| Const21..8 | U2 | Sz4 | m3 | z | Const7..0 | Rb8 | Ra8 | Rt8 | 69h8 |

|  |  |  |
| --- | --- | --- |
| Vm[x] | z | Result |
| 0 | 0 | Vt[x] = Vt[x] (unchanged) |
| 0 | 1 | Vt[x] = 0 (set to zero) |
| 1 | 0 | Vt[x] = memory address |
| 1 | 1 | Vt[x] = memory address |

|  |  |
| --- | --- |
| U2 | Unit |
| 0 | integer |
| 1 | floating-point |
| 2 | decimal-float |
| 3 | posit |

|  |  |
| --- | --- |
| Sz4 | Operation Size |
| 0 | byte |
| 1 | wyde |
| 2 | tetra |
| 3 | octa |
| 4 | hexi |
|  |  |

***Operation:***

for x = 0 to vector length

if (Vm[x])

Vt[x] = d + Ra + Rb \* x

else

Vt[x] = z ? 0 : Vt[x]

**Indexed Form**

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 63 48 | 47 44 | 4341 | 40 | 39 32 | 31 24 | 23 16 | 15 8 | 7 0 |
| Const23..8 | Sz4 | m3 | z | Const7..0 | Vb8 | Ra8 | Rt8 | 6Ah8 |

***Operation:***

n = 0

for x = 0 to vector length

if (Vm[x])

Vt[x] = d + Ra + Vb[x]

else

Vt[x] = z ? 0 : Vt[x]

**Exceptions**: none

### STx – Store

**Description**:

Store values to memory. Either the contents of a scalar or vector register or a six-bit immediate constant may be stored. Both scalar and vector store operations are possible.

**Formats Supported**:

**Register Indirect with Displacement**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 3127 | 2625 | 24 20 | 19 | 18 14 | 13 8 | 7 0 |
| Func3..0 | C5 | Tb2 | Rb5 | Ta | Ra5 | Const6 | 70h8 |

**Scalar Indexed Form (ST)**

The effective address (EA) is calculated as the sum of Ra plus Rc multiplied by a scale.

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 31 | 30 28 | 27 | 2625 | 24 20 | 19 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| ~5 | Rm3 | Tc | Td2 | Rd6 | Tc | Rc5 | A | m3 | z | 58h8 |

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 32 | 3130 | 29 27 | 2625 | 24 20 | 19 | 18 14 | 13 8 | 7 0 |
| Func4 | ~2 | Sc3 | Tb2 | Rb6 | Ta | Ra6 | ~6 | 71h8 |

|  |  |
| --- | --- |
| Sc | Multiplier |
| 0 | 1 |
| 1 | Store size |

***Operation:***

Memory[d+Ra + Rb \* Sc] = Rs

**Vector forms**

**Strided Form (STS)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 2019 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| Const15 | Tc2 | Rc5 | A | m3 | z | 5Ch8 |

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 28 | 2726 | 25 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | C2 | Rb6 | Ra6 | Const6 | F2h8 |

Data is stored to memory addresses separated by the stride amount specified by register field Rc, beginning with the sum of Ra and an immediate value. If the vector mask bit is clear and the ‘z’ bit is set in the instruction then zero is stored to the memory location corresponding to that vector element. If the vector mask bit is clear and the ‘z’ bit is clear in the instruction then the memory location corresponding to that vector element is left unchanged (no value is stored to memory).

Elements are stored only up to the length specified in the vector length register.

|  |  |  |
| --- | --- | --- |
| Vm[x] | z | Result |
| 0 | 0 | Memory = Memory (unchanged) |
| 0 | 1 | Memory = 0 (set to zero) |
| 1 | 0 | memory = Vt[x] |
| 1 | 1 | memory = Vt[x] |

|  |  |
| --- | --- |
| Sz4 | Operation Size |
| 0 | byte |
| 1 | wyde |
| 2 | tetra |
| 3 | octa |
| 4 | hexi |
| 5,6 | reserved |
| 7 | pointer |

***Operation:***

for x = 0 to vector length

if (Vm[x])

Memory[d+Ra + Rb \* x] = Vt[x]

else

Memory[d+Ra + Rb \* x] = z ? 0 : Memory[d+Ra + Rb \* x]

**Indexed Form**

Data is stored to memory addresses beginning with the sum of Ra and a vector element from Vb.

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 21 | 2019 | 18 14 | 1312 | 11 9 | 8 | 7 0 |
| Const15 | Tc2 | Rc5 | A | m3 | z | 5Ch8 |

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 28 | 2726 | 25 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | C2 | Rb6 | Ra6 | Const6 | F3h8 |

***Operation:***

n = 0

for x = 0 to vector length

if (Vm[x])

Memory[d + Ra + Vb[x]] = Vt[x]

else

Memory = z ? 0 : Memory

**Exceptions**: none

### STB – Store Byte (8 bits)

**Description:**

This instruction stores a byte (8 bit) value to memory.

**Instruction Format**: ST

**Register Indirect Operation:**

Memory8[d + Ra] = Rb

**Indexed Operation:**

Memory8[Ra + Rc\*Sc] = Rb

### STBZ – Store Byte and Zero (8 bits)

**Description:**

This instruction stores a byte (8 bit) value to memory. After the byte is stored to memory the register is zeroed out.

**Instruction Format**: ST

**Register Indirect Operation:**

Memory8[d + Ra] = Rb

Rb = 0

**Indexed Operation:**

Memory8[Ra + Rc\*Sc] = Rb

Rb = 0

### STO – Store Octa (64 bits)

**Description:**

This instruction stores an octa-byte (64 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled.

**Instruction Format:** ST

**Operation:**

Memory64[d + Ra + Rb\*Sc] = Rs

### STOZ – Store Octa and Zero (64 bits)

**Description:**

This instruction stores an octa-byte (64 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled. After the octa is stored to memory the register is zeroed out.

**Instruction Format:** ST

**Operation:**

Memory64[d + Ra + Rb\*Sc] = Rs

Rs = 0

### STPTR – Store Pointer (64 bits)

**Description:**

This instruction stores an octa-byte (64 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled. STPTR begins a series of stores in which each subsequent address is the previous address shifted right by eight bits, continuing until address zero is reached. The first store proceeds normally; for the second and subsequent stores a byte store operation takes place with the value zero being written to memory.

The purpose of the STPTR instruction is to allow a code dense implementation of a write barrier that indicates where in memory a pointer is stored with increasing resolution.

This instruction assumes that card memory used to record pointer locations is located at the low end of the memory system.

**Instruction Format:** ST

**Operation:**

ea = d + Ra + Rb\*Sc

Memory64[ea] = Rs

while ea <> 0

ea = ea >> 8

Memory8[ea] = 0
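
The store sequence can be traced with a small Python model. This is a sketch only: memory is represented as a dictionary from byte address to byte value, the 64-bit value is assumed to be stored little-endian, and the function name `stptr` and its return value (the list of card addresses written) are illustrative choices.

```python
MASK64 = (1 << 64) - 1

def stptr(memory, ea, value):
    """Store a pointer at ea, then mark each coarser card with a zero byte."""
    for offset, byte in enumerate((value & MASK64).to_bytes(8, "little")):
        memory[ea + offset] = byte       # the normal 64-bit store
    marked = []
    while ea != 0:
        ea >>= 8                         # next, coarser card address
        memory[ea] = 0                   # byte store of zero
        marked.append(ea)
    return marked
```

For a pointer stored at address `0x123456` the model writes zero bytes at `0x1234`, `0x12` and finally `0x0`, which shows why the card memory is expected at the low end of the address space.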

### STT – Store Tetra (32 bits)

**Description:**

This instruction stores a tetra-byte (32 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled.

**Instruction Format:** ST

**Operation:**

Memory32[d + Ra + Rb\*Sc] = Rs

### STTZ – Store Tetra and Zero (32 bits)

**Description:**

This instruction stores a tetra-byte (32 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled. After the tetra is stored to memory the register is zeroed out.

**Instruction Format:** ST

**Operation:**

Memory32[d + Ra + Rb\*Sc] = Rs

Rs = 0

### STW – Store Wyde (16 bits)

**Description:**

This instruction stores a wyde (16 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled.

**Instruction Format:** ST

**Operation:**

Memory16[d + Ra + Rb\*Sc] = Rs

### STWZ – Store Wyde and Zero (16 bits)

**Description:**

This instruction stores a wyde (16 bit) value to memory. The memory address is calculated as the sum of an immediate constant and the sum of Ra and Rb scaled. After the wyde is stored to memory the register is zeroed out.

**Instruction Format:** ST

**Operation:**

Memory16[d + Ra + Rb\*Sc] = Rs

Rs = 0

## Flow Control (Branch Unit) Operations

### Branches

#### Displacement

The conditional branch displacement is expressed as a count of instructions to skip rather than as an address offset, which increases the range a branch may cover. A displacement of one represents nine nybbles. Code using conditional branches must be laid out sequentially in memory, with instructions adjacent to each other and no “holes” in the layout.

The displacement for the branch-and-link instruction is the number of nybbles to the target address from the current one. This allows subroutines to be aligned at any nybble address.

#### Modifier

The branch modifier may be used to make it possible to branch to a target address contained in a register, and to store the return address in a register. Simultaneously the branch displacement is increased to 26 bits allowing a ±150MB branch range.
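
The arithmetic behind these displacements can be checked with a short Python sketch. It assumes the instruction pointer counts nybbles (half bytes) and that displacements are signed; the function names are hypothetical.

```python
NYBBLES_PER_STEP = 9                     # one displacement unit for conditional branches

def conditional_target(ip, displacement):
    """Target of a conditional branch, with ip measured in nybbles."""
    return ip + displacement * NYBBLES_PER_STEP

def reach_bytes(bits, nybbles_per_unit):
    """One-sided reach in bytes of a signed displacement field."""
    return (1 << (bits - 1)) * nybbles_per_unit // 2
```

Under these assumptions a 26-bit displacement counted in nybbles reaches 16,777,216 bytes either way, and a 26-bit displacement counted in nine-nybble units reaches 150,994,944 bytes, which is consistent with the roughly ±150MB figure quoted above.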

### BAL – Branch and Link

**Description**:

This instruction may be used to call a subroutine using relative addressing. The address of the instruction after the BAL is stored in the specified return address register (Rt) then a jump to the address specified in the instruction is made. The address range is 26 bits or ±16MB.

The return address register is assumed to be x1 if not otherwise specified. The BAL instruction does not require space in branch predictor tables.

**Formats Supported**: BAL

|  |  |  |
| --- | --- | --- |
| 35 10 | 9 8 | 7 0 |
| Constant26 | Rt2 | 41h8 |

**Flags Affected**: none

**Operation:**

Rt = IP + 9

IP = IP + displacement

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### BBS – Branch if Bit Set

**Description**:

This instruction branches to the target address if the bit of Ra selected by the Rb specifier is set; otherwise, program execution continues with the next instruction. Rb may be a value in a register or a six-bit unsigned immediate value. With a branch modifier instruction, the target address is formed as the sum of Rc and a displacement. If Rc is x31 then the instruction pointer value is used. Without a modifier, the target address is the sum of the instruction pointer value and the displacement specified in the instruction.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 4Dh8 |

**Operation:**

If (Ra[Rb])

IP = IP + Displacement12\*9

With Modifier

Rt = IP + 9

If (Ra[Rb])

IP = Rc + Displacement26\*9
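A minimal Python sketch of the operation above. The function name is hypothetical, the link-register write of the modified form is omitted, and passing `rc_value=None` to mean "no modifier present" is a modelling choice, not part of the encoding.

```python
def bbs_next_ip(ip, ra_value, bit_no, disp, rc_value=None):
    """Return the next instruction pointer for BBS."""
    if not (ra_value >> bit_no) & 1:
        return ip + 9                  # bit clear: fall through to the next instruction
    base = ip if rc_value is None else rc_value
    return base + disp * 9             # displacement counts nine-nybble instructions

taken = bbs_next_ip(100, 0b1000, 3, 2)
not_taken = bbs_next_ip(100, 0b1000, 2, 2)
via_rc = bbs_next_ip(100, 0b1000, 3, 2, rc_value=1000)
```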

**Execution Units**: Branch

**Exceptions**: none

**Notes:**

### BEQ – Branch if Equal

**Description**:

This instruction branches to the target address if the contents of Ra and Rb are equal, otherwise program execution continues with the next instruction. With a branch modifier instruction, the target address is formed as the sum of Rc and a displacement. If Rc is x31 then the instruction pointer value is used. Otherwise, the target address is the sum of the instruction pointer value and the displacement specified in the instruction.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 35 27 | 2625 | 24 20 | 19 | 18 14 | 13 8 | 7 | 6 0 |
| Const9 | Tb2 | Rb5 | Ta | Ra5 | Const6 | 0 | 4Eh8 |
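The field positions in the table can be exercised with a small decoder. This is a sketch: the helper names are hypothetical, and forming the 15-bit displacement by concatenating Const9 (high) with Const6 (low) and sign-extending it is an assumption, since the table does not say how the two constant fields combine.

```python
def field(word, hi, lo):
    """Extract bits hi..lo (inclusive) of a 36-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode_beq(word):
    disp = (field(word, 35, 27) << 6) | field(word, 13, 8)  # assumed Const9:Const6
    if disp & (1 << 14):                                    # sign-extend 15 bits
        disp -= 1 << 15
    return {
        "opcode": field(word, 6, 0),
        "ra": field(word, 18, 14),
        "ta": field(word, 19, 19),
        "rb": field(word, 24, 20),
        "tb": field(word, 26, 25),
        "disp": disp,
    }

word = (0x1FF << 27) | (5 << 20) | (3 << 14) | (0x3F << 8) | 0x4E
decoded = decode_beq(word)
```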

**Operation:**

If (Ra = Rb)

IP = IP + Displacement15\*9

With Modifier

Rt = IP + 9

If (Ra = Rb)

IP = Rc + Displacement30\*9

**Execution Units**: Branch

**Exceptions**: none

**Notes:**

For a floating-point comparison positive and negative zero are considered equal.

### BGE – Branch if Greater Than or Equal

**Description**:

This instruction branches to the target address if the contents of Ra is greater than or equal to Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as signed values.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 49h7 |

**Operation:**

If (Ra >= Rb)

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions:** none

### BGEU – Branch if Greater Than or Equal Unsigned

**Description**:

This instruction branches to the target address if the contents of Ra is greater than or equal to Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as unsigned values. The target address is formed as the sum of Rc and a displacement. If Rc is x31 then the instruction pointer value is used.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 4Bh7 |

**Operation:**

Rt = IP + 9

If (Ra >= Rb)

IP = Rc + Displacement

**Execution Units**: Branch

**Exceptions:** none

### BGT – Branch if Greater Than

**Description**:

This instruction is an alternate mnemonic for the [BLT](#_BLT_–_Branch) instruction where the register operands have been swapped.

As encoded, with the operands swapped, this instruction branches to the target address if the contents of Ra is less than Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as signed values. The target address is formed as the sum of Rc and a displacement. If Rc is x31 then the instruction pointer value is used.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 48h7 |

**Operation:**

Rt = IP + 9

If (Ra < Rb)

IP = Rc + Displacement

**Execution Units**: Branch

**Exceptions**: none

### BGTU – Branch if Greater Than Unsigned

**Description**:

This instruction is an alternate mnemonic for the [BLTU](#_BLTU_–_Branch) instruction where the register operands have been swapped.

As encoded, with the operands swapped, this instruction branches to the target address if the contents of Ra is less than Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as unsigned values. The target address is formed as the sum of Rc and a displacement. If Rc is x31 then the instruction pointer value is used.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 4Ah7 |

**Operation:**

Rt = IP + 9

If (Ra < Rb)

IP = Rc + Displacement

**Execution Units**: Branch

**Exceptions**: none

### BNE – Branch if Not Equal

**Description**:

This instruction branches to the target address if the contents of Ra and Rb are not equal, otherwise program execution continues with the next instruction.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 4Fh8 |

**Operation:**

If (Ra <> Rb)

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions**: none

### BLE – Branch if Less Than or Equal

**Description**:

This is an alternate mnemonic for the [BGE](#_BGE_–_Branch) instruction, where the register operands have been swapped.

As encoded, with the operands swapped, this instruction branches to the target address if the contents of Ra is greater than or equal to Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as signed values.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Ta2 | Ra5 | Tb2 | Rb5 | Ty2 | Const5 | 0 | 49h7 |

**Operation:**

If (Ra >= Rb)

PC = Rc + Displacement

**Execution Units**: Branch

**Exceptions:** none

### BLEU – Branch if Less Than or Equal Unsigned

**Description**:

This is an alternate mnemonic for the [BGEU](#_BGEU_–_Branch) instruction, where the register operands have been swapped.

As encoded, with the operands swapped, this instruction branches to the target address if the contents of Ra is greater than or equal to Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as unsigned values.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Ta2 | Ra5 | Tb2 | Rb5 | Ty2 | Const5 | 0 | 4Bh7 |

**Operation:**

If (Ra >= Rb)

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions:** none

### BLT – Branch if Less Than

**Description**:

This instruction branches to the target address if the contents of Ra is less than Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as signed values.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 48h7 |

**Operation:**

If (Ra < Rb)

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions**: none

### BLTU – Branch if Less Than Unsigned

**Description**:

This instruction branches to the target address if the contents of Ra is less than Rb, otherwise program execution continues with the next instruction. The values in Ra and Rb are treated as unsigned values.

**Formats Supported**: BR

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 35 29 | 2827 | 26 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Const7 | Tb2 | Rb5 | Ta2 | Ra5 | Ty2 | Const5 | 0 | 4Ah7 |

**Operation:**

if (Ra < Rb)

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions**: none

### BRA – Unconditional Branch

**Description**:

This instruction is an alternate mnemonic for the [BAL](#_BAL_–_Branch) instruction. The address range is 26 bits or ±16MB.

**Formats Supported**: JAL

|  |  |  |
| --- | --- | --- |
| 35 10 | 9 8 | 7 0 |
| Constant26 | 02 | 41h8 |

**Flags Affected**: none

**Operation:**

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### BSR – Unconditional Branch to Subroutine

**Description**:

This instruction is an alternate mnemonic for the [BAL](#_BAL_–_Branch) instruction. The address range is 26 bits or ±16MB.

**Formats Supported**: JAL

|  |  |  |
| --- | --- | --- |
| 35 10 | 9 8 | 7 0 |
| Constant26 | 12 | 41h8 |

**Flags Affected**: none

**Operation:**

Rt = IP + 9

IP = IP + Displacement

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### CHK – Check Register Against Bounds

**Description**:

A register is compared to two values. If the register is outside of the bounds then an exception will occur.

**Instruction Format: RI**

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | Rd6 | Rc6 | A | m3 | z | F7h8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 20 | 19 14 | 13 10 | 9 8 | 7 0 |
| Constant12 | Ra6 | ~ | Cn2 | 22h8 |

|  |  |
| --- | --- |
| Cn2 | Interpretation |
| 0 | Ra <= Rc <= Constant |
| 1 | Ra < Rc <= Constant |
| 2 | Ra <= Rc < Constant |
| 3 | Ra < Rc < Constant |

**Instruction Format**: R3

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 31 29 | 28 26 | 25 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| DT3 | Rm3 | Rd6 | Rc6 | A | m3 | z | F7h8 |

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 10 | 9 8 | 7 0 |
| 22h6 | Rb6 | Ra6 | ~ | Cn2 | 03h8 |

|  |  |
| --- | --- |
| Cn2 | Interpretation |
| 0 | Ra <= Rb <= Rc |
| 1 | Ra < Rb <= Rc |
| 2 | Ra <= Rb < Rc |
| 3 | Ra < Rb < Rc |
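The four Cn2 encodings differ only in whether each bound is inclusive. A Python sketch of the comparison follows; the function name and the generic argument names `lower`, `value`, `upper` are hypothetical and stand for the three operands in the order the table lists them.

```python
def chk_in_bounds(cn, lower, value, upper):
    """True when the check passes; a False result corresponds to the bounds check exception."""
    low_ok = lower < value if cn & 1 else lower <= value    # Cn2 bit 0: strict lower bound
    high_ok = value < upper if cn & 2 else value <= upper   # Cn2 bit 1: strict upper bound
    return low_ok and high_ok

results = [chk_in_bounds(cn, 0, 0, 10) for cn in range(4)]
```

With the value sitting exactly on the lower bound, only the encodings with an inclusive lower comparison (0 and 2) pass.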

**Supported Formats**: .o

**Clock Cycles**: 2

**Execution Units:** Integer ALU, Float, Decimal Float, Posit

**Exceptions**: bounds check

**Notes**:

The system exception handler will typically transfer processing back to a local exception handler.

### JAL – Jump and Link

**Description**:

This instruction may be used to both call a subroutine and return from it. The address of the instruction after the JAL is stored in the specified return address register (Rt) then a jump to the address specified in the instruction is made. The address range is 26 bits or 16MB.

The return address register is assumed to be x1 if not otherwise specified. The JAL instruction does not require space in branch predictor tables.

**Formats Supported**: JAL

|  |  |  |
| --- | --- | --- |
| 35 10 | 9 8 | 7 0 |
| Constant26 | Rt2 | 40h8 |

**Flags Affected**: none

**Operation:**

Rt = IP + 9

IP = displacement

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### JALR – Jump and Link to Register

**Description**:

This instruction may be used to both call a subroutine and return from it. The sum of the current IP and a small constant is stored in the specified return address register (Rt) then a jump to the address specified in the instruction plus an index register value is made.

The return address register is assumed to be x1 if not otherwise specified. The JALR instruction does not require space in branch predictor tables.

If x31 is specified for Ra then the current instruction pointer value is used.

**Formats Supported**: JALR

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 35 22 | 2120 | 19 15 | 14 10 | 9 8 | 7 0 |
| Constant14 | Ta2 | Ra5 | Cnst5 | Rt2 | 42h8 |

**Flags Affected**: none

**Operation:**

Rt = IP + Cnst5\*2

If Ra=31

IP = IP + displacement

Else

IP = Ra + Displacement
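The operation above can be sketched as a function returning the link value and the new instruction pointer. The function name and argument layout are hypothetical; register contents are passed in directly rather than read from a register file.

```python
def jalr(ip, ra, ra_value, cnst5, disp):
    """Return (link value written to Rt, next IP)."""
    link = ip + cnst5 * 2                  # Rt = IP + Cnst5*2
    base = ip if ra == 31 else ra_value    # x31 selects the instruction pointer
    return link, base + disp

ip_relative = jalr(1000, 31, 0, 4, 20)
via_register = jalr(1000, 1, 5000, 0, 0)
```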

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### JMP – Jump

**Description**:

This instruction is an alternate mnemonic for the [JAL](#_JAL_–_Jump) instruction. It may be used to jump directly to a specific address. The address range is 26 bits or 16MB.

The return address register is assumed to be x0 (discarding the return address). The JMP instruction does not require space in branch predictor tables.

**Formats Supported**: JAL

|  |  |  |
| --- | --- | --- |
| 35 10 | 9 8 | 7 0 |
| Constant26 | 02 | 40h8 |

**Flags Affected**: none

**Operation:**

IP = displacement

**Execution Units**: Branch

**Exceptions**: none

**Notes**:

### RET – Return from Subroutine

**Description**:

This instruction is an alternate mnemonic for the [JALR](#_JALR_–_Jump) instruction. Register Ra is assumed to be x1 and register Rt is assumed to be x0. The constant is assumed to be zero.

**Formats Supported**: JALR

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 35 22 | 2120 | 19 15 | 14 13 | 12 8 | 7 0 |
| Constant14 | 02 | 015 | 02 | 05 | 42h8 |

**Flags Affected**: none

**Operation:**

**Execution Units**: Branch

**Exceptions**: an unimplemented instruction exception may occur if a vector register is specified.

**Notes**:

Return address prediction hardware may make use of the RET instruction.

## System Instructions

### BRK – Break

**Description**:

This instruction initiates the processor debug routine. The processor enters debug mode. The cause code register is set to the value specified in the instruction. Interrupts are disabled. The instruction pointer is reset to the contents of tvec[4] and instructions begin executing. There should be a jump instruction placed at the break vector location. The address of the BRK instruction is stored in the EIP register.

**Instruction Format**: BRK

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 35 23 | 22 15 | 1413 | 12 8 | 7 0 |
| ~13 | Cause8 | 02 | 05 | 00h8 |

**Operation:**

PMSTACK = (PMSTACK << 4) | 10

CAUSE = Const8

EIP = IP

IP = tvec[4]

**Execution Units**: Branch

**Clock Cycles**:

**Exceptions**: none

**Notes**:

### CSRx – Control and Special / Status Access

**Description**:

The CSR instruction group provides access to control and special or status registers in the core. For the read operation the current value of the CSR is placed in the target register Rt.

**Instruction Format**: CSR

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 35 22 | 2120 | 19 15 | 1413 | 12 8 | 7 | 6 0 |
| Constant14 | Ta2 | Ra5 | Op2 | Rt5 | v | 0Fh7 |

|  |  |  |
| --- | --- | --- |
| Op2 |  | Operation |
| 0 | CSRR | Only read the CSR, no update takes place, Ra should be R0. |
| 1 | CSRW | Write to CSR |
| 2 | CSRS | Set CSR bits |
| 3 | CSRC | Clear CSR bits |

CSRS and CSRC operations are only valid on registers that support the capability.
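The four Op2 encodings can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the Ra value is assumed to act as a bit mask for CSRS and CSRC, and returning the old CSR value for every operation (not only CSRR) is an assumption.

```python
def csr_access(op, csr_value, ra_value):
    """Return (value placed in Rt, new CSR value) for Op2 = 0..3."""
    old = csr_value
    if op == 1:                        # CSRW: write
        csr_value = ra_value
    elif op == 2:                      # CSRS: set the bits selected by Ra
        csr_value = old | ra_value
    elif op == 3:                      # CSRC: clear the bits selected by Ra
        csr_value = old & ~ra_value
    return old, csr_value              # Op2 = 0 (CSRR) leaves the CSR unchanged

set_result = csr_access(2, 0b0101, 0b0011)
clear_result = csr_access(3, 0b0101, 0b0011)
```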

The Regno[15..12] field is reserved to specify the operating mode. Note that registers cannot be accessed by a lower operating mode.

**Execution Units:** Integer, the instruction may be available on only a single execution unit (not supported on all available integer units).

**Clock Cycles**: 1

**Exceptions**: privilege violation attempting to access registers outside of those allowed for the operating mode.

### PEEK – Peek at Queue / Stack

**Description**:

This instruction returns the top value into Rt from the hardware queue specified in Ra. The hardware queue position is not advanced. Unused value bits should read as zero. Use the STAT instruction to get the queue status.

**Instruction Format**: PEEKQ

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Ah6 | 06 | Ra6 | Rt6 | 44h8 |

**Instruction Format**: PEEKQI

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Eh6 | 06 | Qno6 | Rt6 | 44h8 |


**Exceptions:** none

### PFI – Poll for Interrupt

**Description**:

The poll for interrupt instruction polls the interrupt status lines and performs an interrupt service if an interrupt is present. Otherwise, the PFI instruction is treated as a NOP operation. Polling for interrupts is performed by managed code. PFI provides a means to process interrupts at specific points in running software.

**Instruction Format: SYS**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 | 6 0 |
| 11h6 | 06 | 06 | 06 | v | 44h7 |

**Clock Cycles**: 1 (if no exception present)

**Execution Units: Branch**

### POP – Pop from Queue / Stack

**Description**:

This instruction pops a value into Rt from the hardware queue specified in Ra. The hardware queue position is advanced. Unused value bits should read as zero. To check the queue status, use the STAT instruction.

|  |
| --- |
| 63 0 |
| Value |

Value: the value that was pushed to the queue

**Instruction Format**: POP

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 09h6 | 06 | Ra6 | Rt6 | 44h8 |

**Instruction Format**: POPI

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Dh6 | 06 | Qno6 | Rt6 | 44h8 |

**Exceptions:** none

**Notes:**

Queue #15 is the instruction trace queue.

### PUSH – Push on Queue / Stack

**Description**:

This instruction pushes an N-bit value in Ra onto the hardware queue specified in Rb, where N is implementation defined between 1 and 64 bits. To check the queue status, use the STAT instruction.

**Instruction Format**: PUSH

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 08h6 | Rb6 | Ra6 | 06 | 44h8 |

**Instruction Format**: PUSHI

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Ch6 | Qno6 | Ra6 | 06 | 44h8 |

**Instruction Format**: PUSHQ

**Exceptions:** none

### REX – Redirect Exception

**Description**:

This instruction redirects an exception from an operating mode to a lower operating mode. If successful, this instruction jumps to the target exception handler and does not return. If it fails, execution continues with the next instruction.

This instruction may fail if exceptions are not enabled at the target level.

The location of the target exception handler is found in the trap vector register for that operating mode (tvec[xx]).

The cause (cause) and bad address (badaddr) registers of the originating mode are copied to the corresponding registers in the target mode.

**Instruction Format**: REX

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 59 | 58 56 | 55 48 | 47 44 | 4341 | 40 | 39 32 | 31 24 | 23 16 | 15 8 | 7 0 |
| ~ | Rm3 | 7Ah8 | Tm3 | m3 | z | Rc8 | Imm8 | Ra8 | 08 | 44h8 |

|  |  |
| --- | --- |
| Tm3 |  |
| 0 | redirect to user mode |
| 1 | redirect to supervisor mode |
| 2 | redirect to hypervisor mode |
| 3 | redirect to machine mode |
| 4 to 7 | not used |

**Clock Cycles**: 4

**Execution Units: Branch**

Example:

|  |
| --- |
| REX 1 ; redirect to supervisor handler  ; If the redirection failed, exceptions were likely disabled at the target level.  ; Continue processing so the target level may complete its operation.  RTE ; redirection failed (exceptions disabled ?) |

**Notes**:

Since all exceptions are initially handled in debug mode the debug handler must check for disabled lower mode exceptions.

### RTE – Return from Exception

**Description**:

Restore the previous interrupt enable setting and operating level and transfer program execution back to the address in the exception address register (EIP). One of sixty-four semaphore registers specified by the Rb field of the instruction may also be cleared. Semaphore register zero is always cleared by this instruction.

This instruction may be encoded to return a short distance past the exception address point. This may be useful to return to the next instruction or return to a point past inline parameters. The Ra field specifies a return offset in terms of instruction words.

There is only a single instruction to return from an exception in any mode, although several additional mnemonics are provided for it.

**Instruction Format: SYS**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 13h6 | Rb6 | Ra6 | Opt6 | 44h8 |

Opt6[0]: 0 = Ra is reg spec, 1 = Ra is six-bit unsigned immediate

Opt6[1]: 0 = Rb is reg spec, 1 = Rb is six-bit unsigned immediate

**Flags Affected**: none

**Operation:**

PMSTACK = PMSTACK >> 4

Semaphore[0] = 0

Semaphore[Rb] = 0

IP = EIP + Ra
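The PMSTACK updates in BRK and RTE act as a push and a pop of a four-bit entry. The sketch below illustrates this; the 64-bit register width and the helper names are assumptions, and the pushed value is kept as a parameter because the BRK operation gives it only as the literal 10.

```python
MASK64 = (1 << 64) - 1  # assumed register width

def enter_exception(pmstack, entry=10):
    """BRK: PMSTACK = (PMSTACK << 4) | 10."""
    return ((pmstack << 4) | entry) & MASK64

def return_from_exception(pmstack):
    """RTE: PMSTACK = PMSTACK >> 4."""
    return pmstack >> 4

pushed = enter_exception(0x3)
restored = return_from_exception(pushed)
```

As long as the stack does not overflow its width, an RTE restores the value that was present before the matching exception entry.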

**Execution Units**: Branch

**Clock Cycles**:

**Exceptions**: none

**Notes**:

### STAT – Get Status of Queue / Stack

**Description**:

This instruction returns a queue status value into Rt from the hardware queue specified in Ra. The hardware queue position is not advanced. Unused value bits should read as zero.

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 63 | 62 | 61 54 | 53 48 | 47 10 | 9 0 |
| Qe | Dv | ~ | ~ | ~ | Data Count |

Fields

Qe: empty. If set, this bit indicates that the queue/stack is empty.

Dv: data valid. If this bit is set it indicates that valid data is present at the queue.

Dc: data count: The number of items left in the queue
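A status word returned by STAT can be unpacked as sketched below. The function name and dictionary keys are hypothetical; the sketch takes the data count to occupy bits 9 to 0, as the last column of the table indicates.

```python
def decode_queue_status(status):
    """Split a 64-bit STAT result into its documented fields."""
    return {
        "empty": bool((status >> 63) & 1),       # Qe
        "data_valid": bool((status >> 62) & 1),  # Dv
        "count": status & 0x3FF,                 # Dc, bits 9..0
    }

info = decode_queue_status((1 << 62) | 5)
```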

**Instruction Format**: POP

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Bh6 | 06 | Ra6 | Rt6 | 44h8 |

**Instruction Format**: POPI

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 0Fh6 | 06 | Qno6 | Rt6 | 44h8 |

**Exceptions:** none

### SYNC – Synchronize

**Description**:

All instructions for a particular unit before the SYNC are completed and committed to the architectural state before instructions of the unit type after the SYNC are issued. This instruction is used to ensure that the machine state is valid before subsequent instructions are executed.

**Instruction Format**:

|  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6361 | 60 58 | 57 50 | 4948 | 47 44 | 4341 | 40 | 39 32 | 31 24 | 23 16 | 15 8 | 7 0 |
| ~3 | Op3 | ??h8 | U2 | Sz4 | m3 | z | Rc8 | Rb8 | Ra8 | Rt8 | 44h8 |

### TLBRW – Read / Write TLB

**Description**:

This instruction both reads and writes the TLB. Which translation entry to update comes from the value in Ra. The update value comes from the value in Rb. Rb contains the virtual page number, ASID, and physical page number. The current value of the entry selected by Ra is copied to Rt. The TLB will be written only if bit 63 of Ra is set.

The entry number for Ra comes from virtual address bits 14 to 23.

Page numbers are in terms of a 16kB page size.

**Instruction Format: SYS**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 1Eh6 | Rb6 | Ra6 | Rt6 | 44h8 |

**Clock Cycles**: 5

**Execution Units: Memory**

Ra Value Format

|  |  |  |  |
| --- | --- | --- | --- |
| 63 | 62 12 | 11 10 | 9 0 |
| w | ~ | way | entry no |

Rb/Rt Value Format

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 63 56 | 55 | 54 | 53 | 52 48 | 47 32 | 31 20 | 19 0 |
| ASID | G | D | A | UCRWX | VPN | ~ | PPN |

|  |  |  |  |
| --- | --- | --- | --- |
| Bits |  | Meaning | |
| 0 to 19 | PPN | Physical page number | |
| 20 to 31 | ~ | reserved (expansion of physical page number) | |
| 32 to 47 | VPN | Virtual page number, high-order address bits 24 to 39 | |
| 48 | X | 1 = page is executable | These three combined indicate page present (P); 0 = not present |
| 49 | W | 1 = page is writeable | |
| 50 | R | 1 = page is readable | |
| 51 | C | 1 = page is cacheable | |
| 52 | U | reserved for system usage | |
| 53 | A | Accessed, set if translation was used | |
| 54 | D | Dirty, set if a write occurred to the page | |
| 55 | G | Global, global translation indicator | |
| 56 to 63 | ASID | ASID address space identifier | |
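The Ra and Rb/Rt layouts can be assembled as sketched below. All function names are hypothetical, and the sketch follows the bit positions in the value-format tables (VPN in bits 32 to 47, taken from virtual address bits 24 to 39; entry number from virtual address bits 14 to 23).

```python
def pack_tlb_entry(ppn, vpn, asid, x=0, w=0, r=0, c=0, u=0, a=0, d=0, g=0):
    """Build the Rb value: PPN, VPN, flag bits 48..55, ASID."""
    entry = ppn & 0xFFFFF                    # bits 0..19
    entry |= (vpn & 0xFFFF) << 32            # bits 32..47
    for bit, flag in zip(range(48, 56), (x, w, r, c, u, a, d, g)):
        entry |= (flag & 1) << bit
    entry |= (asid & 0xFF) << 56             # bits 56..63
    return entry

def tlb_select(vaddr, way, write):
    """Build the Ra value: write flag, way, entry number."""
    entry_no = (vaddr >> 14) & 0x3FF         # virtual address bits 14..23
    return ((write & 1) << 63) | ((way & 3) << 10) | entry_no

def vpn_of(vaddr):
    return (vaddr >> 24) & 0xFFFF            # virtual address bits 24..39

vaddr = (0xABCD << 24) | (0x155 << 14)
ra = tlb_select(vaddr, way=2, write=1)
rb = pack_tlb_entry(0x12345, vpn_of(vaddr), asid=0x7F, x=1, r=1)
```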

**Exceptions:** none

### WFI – Wait for Interrupt

**Description**:

The WFI instruction waits for an external interrupt to occur before proceeding. While waiting for the interrupt, the processor clock is stopped, placing the processor in a lower power mode.

**Instruction Format: SYS**

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| 12h6 | 06 | 06 | 06 | 44h8 |

**Clock Cycles**: 1 (if no exception present)

**Execution Units: Branch**

# Vector Specific Instructions

### MFILL – Mask Fill

**Description**

Fill vector mask register with bits.

The first Ra bits of the vector mask register (Vmt) are set to one. The remaining bits of the mask register are set to zero.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 0Ch6 | ~6 | Ra6 | 03 | Vmt3 | 80h8 |

**Operation**

Vmt = 0

for x = 0 to VLMAX

if (x < Ra) Vmt[x] = 1
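The fill is equivalent to building a run of Ra one bits. In the sketch below the mask register is modelled as a Python integer; the function name and the default of 128 mask bits are assumptions (128 is only suggested by the "no bits set" result of MFIRST).

```python
def mfill(ra, vlmax=128):
    """Return a mask whose lowest `ra` bits are one and the rest zero."""
    count = max(0, min(ra, vlmax))   # clamp to the mask width
    return (1 << count) - 1

three = mfill(3)
full = mfill(200)
```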

**Execution Units:** ALUs

### MFIRST – Find First Set Bit

**Description**

The position of the first bit set in the mask register is copied to the target register. If no bits are set the value is 128. The search begins at the least significant bit and proceeds to the most significant bit.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 8 | 7 0 |
| 0Eh6 | ~6 | 03 | Vm3 | Rt6 | 80h8 |

**Operation**

Rt = first set bit number of (Vm)
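A sketch of the search, with the mask register modelled as a Python integer. The function name is hypothetical, and the 128-bit width is inferred from the "no bits set" result stated in the description.

```python
def mfirst(mask, width=128):
    """Position of the lowest set bit, or `width` when no bit is set."""
    for bit in range(width):         # least significant bit first
        if (mask >> bit) & 1:
            return bit
    return width

first = mfirst(0b10100)
```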

**Exceptions:** none

**Execution Units:** ALUs

### MFM – Move from Mask

**Description**

Move a mask register to a general-purpose register.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 8 | 7 0 |
| 11h6 | ~6 | 03 | Vm3 | Rt6 | 80h8 |

**Operation**

Rt = Vm

**Execution Units:** ALUs

### MFVL – Move from Vector Length

**Description**

Move vector length register to a general-purpose register.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 8 | 7 0 |
| 13h6 | ~6 | 03 | 03 | Rt6 | 80h8 |

**Operation**

Rt = VL

**Execution Units:** ALUs

### MLAST – Find Last Set Bit

**Description**

The position of the last bit set in the mask register is copied to the target register. If no bits are set the value is 128. The search begins at the most significant bit of the mask register and proceeds to the least significant bit.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 17 | 16 14 | 13 8 | 7 0 |
| 0Fh6 | ~6 | 03 | Vm3 | Rt6 | 80h8 |

**Operation**

Rt = last set bit number of (Vm)
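A sketch of the search from the top of the mask, with the mask register modelled as a Python integer. The function name is hypothetical, and the 128-bit width is inferred from the "no bits set" result stated in the description.

```python
def mlast(mask, width=128):
    """Position of the highest set bit, or `width` when no bit is set."""
    for bit in range(width - 1, -1, -1):   # most significant bit first
        if (mask >> bit) & 1:
            return bit
    return width

last = mlast(0b10100)
```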

**Exceptions:** none

**Execution Units:** ALUs

### MTM – Move to Mask

**Description**

Move a general-purpose register to a mask register.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 10h6 | ~6 | Ra6 | 03 | Vmt3 | 80h8 |

**Operation**

Vmt = Ra

**Execution Units:** ALUs

### MTVL – Move to Vector Length

**Description**

Move a general-purpose register to the vector length register.

**Instruction Format: R1**

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 26 | 25 20 | 19 14 | 13 11 | 10 8 | 7 0 |
| 12h6 | ~6 | Ra6 | 03 | 03 | 80h8 |

**Operation**

VL = Ra

**Execution Units:** ALUs

## Arithmetic / Logical

### V2BITS

**Description**

Convert Boolean vector to bits. The least significant bit of each vector element is copied to the corresponding bit in the target register. The target register is a scalar register.

**Instruction Format: R1**

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 39 32 | 3128 | 27 25 | 24 | 2322 | 21 16 | 1514 | 13 8 | 7 0 |
| 18h8 | ~4 | m3 | z | 12 | Va6 | 02 | Rt6 | 81h8 |

**Operation**

For x = 0 to VL-1

if (Vm[x])

Rt[x] = Va[x].LSB

else if (z)

Rt[x] = 0
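The masked bit-gather can be sketched as follows, with the vector modelled as a list of integers and the mask and target as Python integers. The function name is hypothetical.

```python
def v2bits(va, vm, rt, z, vl):
    """Copy the LSB of each active element of va into the matching bit of rt."""
    for x in range(vl):
        if (vm >> x) & 1:
            rt = (rt & ~(1 << x)) | ((va[x] & 1) << x)
        elif z:
            rt &= ~(1 << x)          # masked-off bit is zeroed when z is set
    return rt

merged = v2bits([1, 2, 3, 4], 0b1011, 0b1111, False, 4)
zeroed = v2bits([1, 2, 3, 4], 0b1011, 0b1111, True, 4)
```

With z clear, bit 2 of the target keeps its previous value because element 2 is masked off; with z set it is cleared.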

**Exceptions:** none

### VBITS2V

**Description**

Convert bits to Boolean vector. Bits from a general register are copied to the corresponding vector target register.

**Instruction Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 19h6 | ~2 | m3 | z | Ra6 | Vt6 | 81h8 |

**Operation**

For x = 0 to VL-1

if (Vm[x]) Vt[x] = Ra[x]

**Exceptions:** none

### VCIDX – Compress Index

**Description**

A value in a register Ra is multiplied by the element number and copied to elements of vector register Vt guided by a vector mask register.

**Instruction Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 2Dh6 | ~2 | m3 | z | Ra6 | Vt6 | 81h8 |

**Operation**

y = 0

for x = 0 to VL - 1

if (Vm[x])

Vt[y] = Ra \* x

y = y + 1
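A sketch of the operation, assuming that the increment of y belongs inside the masked branch (the indentation of the pseudocode was lost) and that target elements past the compressed run are left unchanged. The function name is hypothetical.

```python
def vcidx(ra, vm, vt, vl):
    """Write ra * x for each active element x into consecutive slots of vt."""
    out = list(vt)
    y = 0
    for x in range(vl):
        if (vm >> x) & 1:
            out[y] = ra * x
            y += 1                   # advance only when an element was written
    return out

offsets = vcidx(8, 0b1010, [0, 0, 0, 0], 4)
```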

### VCMPRSS – Compress Vector

**Description**

Selected elements from vector register Va are copied to elements of vector register Vt guided by a vector mask register.

**Instruction Format: R1**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 26 | 2524 | 23 21 | 20 | 19 14 | 13 8 | 7 0 |
| 2Ch6 | ~2 | m3 | z | Va6 | Vt6 | 81h8 |

**Operation**

y = 0

for x = 0 to VL - 1

if (Vm[x])

Vt[y] = Va[x]

y = y + 1
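A sketch of the compress operation, assuming that the increment of y belongs inside the masked branch (the indentation of the pseudocode was lost) and that target elements past the compressed run are left unchanged. The function name is hypothetical.

```python
def vcmprss(va, vm, vt, vl):
    """Pack the active elements of va into the low elements of vt."""
    out = list(vt)
    y = 0
    for x in range(vl):
        if (vm >> x) & 1:
            out[y] = va[x]
            y += 1                   # advance only when an element was copied
    return out

packed = vcmprss([10, 20, 30, 40], 0b1010, [0, 0, 0, 0], 4)
```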

### VEINS / VMOVSV – Vector Element Insert

**Synopsis**

Vector element insert.

**Description**

A general-purpose register Rb is transferred into one element of a vector register Vt. The element to insert is identified by Ra.

**Operation**

Vt[Ra] = Rb

**Exceptions**: none

### VEX / VMOVS – Vector Element Extract

**Synopsis**

Vector element extract.

**Description**

A vector register element from Vb is transferred into a general-purpose register Rt. The element to extract is identified by Ra.

**Operation**

Rt = Vb[Ra]

**Exceptions**: none

### VSCAN

**Synopsis**

.

**Description**

Elements of Vt are set to the cumulative sum of a value in register Ra. The summation is guided by a vector mask register.

**Operation**

sum = 0

for x = 0 to VL - 1

Vt[x] = sum

if (Vm[x])

sum = sum + Ra
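The operation is an exclusive running sum: each element receives the total accumulated before it. The sketch below assumes the addition is the only statement guarded by the mask; the function name is hypothetical.

```python
def vscan(ra, vm, vl):
    """Return the running sum of ra, advanced only at active elements."""
    out = []
    total = 0
    for x in range(vl):
        out.append(total)            # element gets the sum so far
        if (vm >> x) & 1:
            total += ra
    return out

sums = vscan(4, 0b1011, 4)
```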

### VSLLV – Shift Vector Left Logical

**Description**

Elements of the vector are transferred upwards to the next element position. The first is loaded with the value zero. This is also called a slide operation.

**Operation**

For x = VL-1 to Amt

Vt[x] = Va[x-amt]

For x = Amt-1 to 0

Vt[x] = 0
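The slide-up can be sketched with the vector modelled as a list, assuming Amt lies between 0 and VL. The function name is hypothetical.

```python
def vsllv(va, amt, vl):
    """Move elements up by amt positions; the low amt elements become zero."""
    vt = [0] * vl
    for x in range(vl - 1, amt - 1, -1):   # x = VL-1 down to Amt
        vt[x] = va[x - amt]
    return vt

slid_up = vsllv([1, 2, 3, 4], 1, 4)
```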

**Exceptions:** none

### VSRLV – Shift Vector Right Logical

**Description**

Elements of the vector are transferred downwards to the next element position. The last is loaded with the value zero. This is also called a slide operation.

**Operation**

For x = 0 to VL-Amt-1

Vt[x] = Va[x+amt]

For x = VL-Amt to VL-1

Vt[x] = 0
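The slide-down can be sketched with the vector modelled as a list, assuming Amt lies between 0 and VL. The function name is hypothetical. The copy loop covers only those elements whose source index x + Amt lies inside the vector.

```python
def vsrlv(va, amt, vl):
    """Move elements down by amt positions; the high amt elements become zero."""
    vt = [0] * vl
    for x in range(vl - amt):        # source index x + amt stays below VL
        vt[x] = va[x + amt]
    return vt

slid_down = vsrlv([1, 2, 3, 4], 1, 4)
```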

**Exceptions:** none

## Memory Operations

### CVLDx – Compressed Vector Load

**Description**:

**Formats Supported**:

**Strided Form (CVLDSx)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 1312 | 11 9 | 8 | 7 0 |
| Const11 | I | Rc6 | A | m3 | z | 5Ch8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 28 | 27 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | Const8 | Ra6 | Vt6 | E6h8 |

Data is loaded from memory addresses separated by the stride amount specified by register field Rc, beginning with the sum of Ra and an immediate value. Rc may specify either a register or a six-bit unsigned constant. If the vector mask bit is clear and the ‘z’ bit is set in the instruction then the corresponding element of the vector register is loaded with zero. If the vector mask bit is clear and the ‘z’ bit is clear in the instruction then the corresponding element of the vector register is left unchanged (no value is loaded from memory).

Elements are loaded only up to the length specified in the vector length register.

***Operation:***

y = 0

for x = 0 to vector length

if Rc is a constant

stride = Rc

else

stride = [Rc]

n = stride \* y

if (Vm[x])

Vt[y] = Memory[d+Ra + n]

y = y + 1

for y = y to vector length

Vt[y] = z ? 0 : Vt[y]

n = 0

|  |  |  |
| --- | --- | --- |
| Vm[x] | z | Result |
| 0 | 0 | Vt[x] = Vt[x] (unchanged) |
| 0 | 1 | Vt[x] = 0 (set to zero) |
| 1 | 0 | Vt[x] = memory, sign extended |
| 1 | 1 | Vt[x] = memory, zero extended |
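The strided, compressed load can be sketched in Python as below. Several things here are assumptions: memory is modelled as a dictionary keyed by address, `disp` stands for the `d` term in the operation, the stride has already been resolved to an integer, the increment of `y` is taken to belong to the masked branch (the flattened listing leaves this ambiguous), and sign or zero extension of the loaded value is not modelled.

```python
def cvlds(memory, base, disp, stride, vm, vt, vl, z):
    """Model of a strided compressed load: values for enabled mask bits
    are packed into the low elements of the target vector."""
    vt = list(vt)                      # work on a copy of the old register value
    y = 0                              # next packed destination element
    for x in range(vl):
        if vm[x]:
            vt[y] = memory[disp + base + stride * y]
            y += 1
    for i in range(y, vl):             # remaining elements: zero or unchanged
        if z:
            vt[i] = 0
    return vt

memory = {0x100 + 8 * i: 100 + i for i in range(8)}
print(cvlds(memory, 0x100, 0, 8, [1, 0, 1, 1], [-1] * 4, 4, True))   # [100, 101, 102, 0]
print(cvlds(memory, 0x100, 0, 8, [1, 0, 1, 1], [-1] * 4, 4, False))  # [100, 101, 102, -1]
```

Three mask bits are set, so three consecutive strided locations are read and the fourth element is either zeroed or left as it was, depending on `z`.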

**Indexed Form (CVLDxVX)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 13 12 | 11 9 | 8 | 7 0 |
| Const11 | 0 | Vc6 | A | m3 | z | DCh8 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| 31 28 | 27 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | Const8 | Ra6 | Vt6 | E7h8 |

Data is loaded from memory addresses beginning with the sum of Ra and a vector element from Vc.

***Operation:***

y = 0

for x = 0 to vector length

if (Vm[x])

Vt[y] = Memory[d+Ra + Vc[x]]

y = y + 1

for y = y to vector length

Vt[y] = z ? 0 : Vt[y]

**Exceptions**: none
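A comparable sketch for the indexed form is given below. As before, the dictionary-based memory, the `disp` parameter standing in for `d`, and placing the increment of `y` inside the masked branch are assumptions made for illustration.

```python
def cvldvx(memory, base, disp, vc, vm, vt, vl, z):
    """Model of an indexed compressed load: the address offset comes from vc[x],
    and loaded values are packed into the low elements of the target vector."""
    vt = list(vt)
    y = 0
    for x in range(vl):
        if vm[x]:
            vt[y] = memory[disp + base + vc[x]]
            y += 1
    for i in range(y, vl):             # remaining elements: zero or unchanged
        if z:
            vt[i] = 0
    return vt

memory = {0x200 + 4 * i: i * i for i in range(4)}   # values 0, 1, 4, 9
print(cvldvx(memory, 0x200, 0, [0, 4, 12, 8], [0, 1, 1, 0], [7] * 4, 4, False))  # [1, 9, 7, 7]
```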

### CVSTx – Compressed Vector Store

**Description**:

**Formats Supported**:

**Strided Form (CVSTSx)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 13 12 | 11 9 | 8 | 7 0 |
| Const11 | I | Rc6 | A | m3 | z | 5Ch8 |

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 28 | 27 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | C2 | Vb6 | Ra6 | Cnst6 | F6h8 |

Vector elements are stored to memory at addresses beginning with the sum of Ra and an immediate value. Successive store locations are separated by a stride amount contained in Rc or given as a six-bit unsigned immediate.

***Operation:***

y = 0

for x = 0 to vector length

n = Rc \* y

if (Vm[x])

Memory[d+Ra + n] = Vs[x]

if (z) Vs[x] = 0

y = y + 1
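The strided store can be modelled the same way. This sketch assumes a dictionary for memory, a `disp` parameter for `d`, a stride already resolved to an integer, and that both the `z` clearing step and the increment of `y` belong to the masked branch. The listing is flattened, so that nesting is a guess.

```python
def cvsts(memory, base, disp, stride, vm, vs, vl, z):
    """Model of a strided compressed store: enabled source elements are written
    to consecutive strided locations. Returns the (possibly cleared) source vector."""
    vs = list(vs)
    y = 0                               # counts elements stored so far
    for x in range(vl):
        if vm[x]:
            memory[disp + base + stride * y] = vs[x]
            if z:
                vs[x] = 0               # as listed in the operation above
            y += 1
    return vs

memory = {}
cvsts(memory, 0x300, 0, 8, [1, 0, 0, 1], [11, 22, 33, 44], 4, False)
print(memory)  # {768: 11, 776: 44}
```

Only the first and last elements are enabled, yet they land in adjacent strided slots (0x300 and 0x308), which is the compression effect.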

**Indexed Form (CVSTxVX)**

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| 31 21 | 20 | 19 14 | 13 12 | 11 9 | 8 | 7 0 |
| Const11 | 0 | Vc6 | A | m3 | z | DCh8 |

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| 31 28 | 27 26 | 25 20 | 19 14 | 13 8 | 7 0 |
| Func3..0 | C2 | Vb6 | Ra6 | Cnst6 | F7h8 |

Data is stored to memory addresses beginning with the sum of Ra and a vector element from Vb.

***Operation:***

y = 0

for x = 0 to vector length

if (Vm[x])

Memory[d+Ra + Vb[y]] = Vs[x]

if (z) Vs[x] = 0

y = y + 1

**Exceptions**: none
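A sketch of the indexed store follows, using the neutral names `index_vec` and `data_vec` for what the operation above calls Vb[y] and Vs[x]. The dictionary memory, the `disp` parameter, and the nesting of the `y` increment inside the masked branch are assumptions. The `z` clearing step is omitted here for brevity.

```python
def cvstvx(memory, base, disp, index_vec, vm, data_vec, vl):
    """Model of an indexed compressed store: the y-th enabled element of data_vec
    is written at the offset given by index_vec[y]."""
    y = 0                               # counts elements stored so far
    for x in range(vl):
        if vm[x]:
            memory[disp + base + index_vec[y]] = data_vec[x]
            y += 1
    return memory

print(cvstvx({}, 0x400, 0, [0, 16, 32, 48], [0, 1, 0, 1], [5, 6, 7, 8], 4))  # {1024: 6, 1040: 8}
```

Note that the index vector is consumed in packed order (`y`), while the data vector is read in element order (`x`).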

# Root Opcode Map

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| **ALU** | | | | | | | | |
| 00000 | BRK | {R1} | {R2} | {R3/R4} | ADD | SUBF | MUL | {SYS} |
| 00001 | AND | OR | XOR |  |  | {SET} | MULU | CSR |
| 00010 | DIV | DIVU | DIVSU |  |  | MULF | MULSU | PERM |
| 00011 | REM | REMU | BYTNDX | WYDNDX | {BTFLD} |  |  |  |
| 00100 | REMSU | DIVR | CHK | U21NDX | SAND | SOR | SEQ | SNE |
| 00101 | SLT | SGT | SLTU | SGTU |  |  |  |  |
| 00110 | {DF1} | {DF2} | {DF3} | {DF4} | {F1} | {F2} | {F3} | {F4} |
| 00111 | {PST1} | {PST2} | {PST3} | {PST4} |  |  | {VM} | NOP |
| **Branch Unit** | | | | | | | | |
| 01000 | FBLT | FBGE | DFBLT | DFBGE |  |  | FBEQ | FBNE |
| 01001 | BLT | BGE | BLTU | BGEU |  | BBS | BEQ | BNE |
| **Instruction Modifiers (Prefixes)** | | | | | | | | |
| 01010 | EXI | EXI | EXI |  |  |  |  |  |
| 01011 | IMOD | BTFLD | BRMOD |  | STRIDE |  |  |  |
| **Memory Unit** | | | | | | | | |
| 01100 | LDx | LDxX |  |  | LDxZ | LDxXZ |  |  |
| 01101 |  |  |  |  |  |  |  | LSM |
| 01110 | STx | STxX |  |  |  |  |  |  |
| 01111 | JAL | BAL | JALR |  |  |  |  |  |
| **Vector ALU** | | | | | | | | |
| 10000 |  | {R1} | {R2} | {R3} | ADD | SUBF | MUL |  |
| 10001 | AND | OR | XOR |  |  | {SET} | MULU |  |
| 10010 | DIV | DIVU | DIVSU |  |  | MULF | MULSU | PERM |
| 10011 | REM | REMU | BYTNDX | WYDNDX | {BTFLD} |  |  |  |
| 10100 | REMSU | DIVR | CHK | U21NDX |  |  | SEQ | SNE |
| 10101 | SLT | SGT | SLTU | SGTU |  |  |  |  |
| 10110 | {DF1} | {DF2} | {DF3} | {DF4} | {F1} | {F2} | {F3} | {F4} |
| 10111 | {PST1} | {PST2} | {PST3} | {PST4} |  |  |  | NOP |
| 11000 |  |  |  |  |  |  |  |  |
| 11001 |  |  |  |  |  |  |  |  |
| 11010 |  |  |  |  |  |  |  |  |
| 11011 | IMOD | BTFLD | BRMOD |  | STRIDE |  |  |  |
|  | | | | | | | | |
| 11100 |  |  | LDSx | LDxVX |  |  | CVLDSx | CVLDxVX |
| 11101 |  |  |  |  |  |  |  |  |
| 11110 |  |  | STSx | STxVX |  |  | CVSTSx | CVSTxVX |
| 11111 |  |  |  |  |  |  |  |  |
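The map can be read by splitting the eight-bit root opcode into a five-bit row and a three-bit column. The sketch below assumes the row label is opcode bits 7 to 3 and the column label is bits 2 to 0, which agrees with the opcode values quoted for the stride modifier (5Ch) and the compressed vector loads and stores (E6h, E7h, F6h, F7h). The dictionary holds only a handful of entries copied from the table, and the function names are illustrative.

```python
# A few entries copied from the root opcode map; keys are (row, column) bit strings.
ROOT_MAP_SAMPLE = {
    ("01011", "000"): "IMOD",
    ("01011", "100"): "STRIDE",
    ("11100", "110"): "CVLDSx",
    ("11100", "111"): "CVLDxVX",
    ("11110", "110"): "CVSTSx",
    ("11110", "111"): "CVSTxVX",
}

def split_root_opcode(opcode):
    """Split an 8-bit root opcode into (row, column) bit strings.
    Assumption: row = bits 7..3, column = bits 2..0."""
    row = format((opcode >> 3) & 0x1F, "05b")
    col = format(opcode & 0x7, "03b")
    return row, col

def lookup(opcode):
    """Return the mnemonic for an opcode if it is in the sample table."""
    return ROOT_MAP_SAMPLE.get(split_root_opcode(opcode), "(not in sample)")

for op in (0x5C, 0xE6, 0xE7, 0xF6, 0xF7):
    print(f"{op:02X}h -> {split_root_opcode(op)} -> {lookup(op)}")
```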

## {R1} Integer Monadic Register Ops – Func10

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| xxxx000 | CNTLZ | CNTLO | CNTPOP | COM | NOT | NEG | ABS | NABS |
| xxxx001 | SQRT |  |  | TST | ZXB | ZXW | ZXT |  |
| xxxx010 | PTRINC | TRANSFORM |  |  | SXB | SXW | SXT |  |
| xxxx011 | V2BITS | BITS2V |  |  | VCMPRSS | VCIDX | VSCAN |  |
| xxxx100 |  |  |  |  |  |  |  |  |
| xxxx101 |  |  |  |  |  |  |  |  |
| xxxx110 |  |  |  |  |  |  |  |  |
| xxxx111 |  |  |  |  |  |  |  |  |

## {R2} Integer Dyadic Register Ops – Func7

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | AND | OR | XOR |  | ADD | SUB | MUL |  |
| x001 | NAND | NOR | XNOR |  |  | MULF | MULU | MULH |
| x010 | DIV | DIVU | DIVSU | REM | REMU | REMSU | MULSU | PERM |
| x011 | DIF | SLL | SLLI | WYDNDX | MULF | MULSUH | MULUH | RGF |
| x100 | CMP | SRL | SRLI | U21NDX |  |  | SEQ | SNE |
| x101 | MIN | MAX |  |  | SLT | SGE | SLTU | SGEU |
| x110 | BMM.or | BMM.xor | BMM | BMM |  |  |  |  |
| x111 | VSLLV | VSRLV | VEX | VEINS |  |  | RD\_COEFF | WR\_COEFF |

## {R3/R4} Triadic Register Ops

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 |  |  |  |  | MUX |  |  |  |
| x001 |  |  |  |  |  |  |  |  |
| x010 | SLLP | SLLPI |  |  |  |  |  |  |
| x011 | PTRDIF |  |  |  |  |  |  |  |
| x100 |  |  | CHK |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 | BLEND |  |  |  |  |  |  | FDP |
| x111 |  |  |  |  |  |  |  |  |

## {F1} Floating-Point Monadic Ops – Funct7

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | FMOV | FRSQRTE | FTOI | ITOF |  |  | FSIGN | FMAN |
| x001 | FSQRT | FS2D | FS2Q | FD2Q | FSTAT |  | ISNAN | FINITE |
| x010 | FTX | FCX | FEX | FDX | FRM | TRUNC | FSYNC | FRES |
| x011 | FSIGMOID | FD2S | FQ2S | FQ2D |  |  | FCLASS | UNORD |
| x100 | FABS | FNABS | FNEG |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 |  |  |  |  |  |  |  |  |
| x111 |  |  |  |  |  |  |  |  |

## {F2} Floating-Point Dyadic Ops – Funct7

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | SCALEB |  | FMIN | FMAX | FADD | FSUB |  |  |
| x001 | FMUL | FDIV | FREM | FNXT | FAND | FOR |  |  |
| x010 | FCMP | FSEQ | FSLT | FSLE | FSNE | FCMPB | FSETM |  |
| x011 | CPYSGN | SGNINV | SGNAND | SGNOR | SGNXOR | SGNXNOR | FCLASS |  |
| x100 |  |  |  |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 |  |  |  |  |  |  |  |  |
| x111 |  |  |  |  |  |  |  |  |

## {F3} Floating-Point Triadic Ops – Funct7

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | FMA | FMS | FNMA | FNMS |  |  |  |  |
| x001 |  |  |  |  |  |  |  |  |
| x010 |  |  |  |  |  |  |  |  |
| x011 |  |  |  |  |  |  |  |  |
| x100 |  |  |  |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 |  |  |  |  |  |  |  |  |
| x111 |  |  |  |  |  |  |  |  |

## {VM} Vector Mask Register Ops

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | MAND | MOR | MXOR |  | MADD | SUB | MSLL | MSRL |
| x001 | MNAND | MNOR | MXNOR |  | MFILL | MPOP | MFIRST | MLAST |
| x010 | MTM | MFM | MTVL |  |  |  |  |  |
| x011 |  |  |  |  |  |  |  |  |
| x100 |  |  |  |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 |  |  |  |  |  |  |  |  |
| x111 |  |  |  |  |  |  |  |  |

## {OSR2} System Ops

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | 000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| x000 | LLAL | LLAH |  |  | LPAL | LPAH |  |  |
| x001 | PUSHQ | POPQ | PEEKQ | STATQ |  | POPQI | PEEKQI | STATQI |
| x010 | REX | PFI | WAI | RTE | SETKEY |  |  |  |
| x011 | SETTO | GETTO | GETZL |  |  | MVSEG | TLBRW | SYNC |
| x100 |  |  |  |  |  |  |  |  |
| x101 |  |  |  |  |  |  |  |  |
| x110 |  |  |  |  |  |  |  |  |
| x111 |  |  |  |  |  |  |  |  |